ANHA Series Preface


The Applied and Numerical Harmonic Analysis (ANHA) book series aims to provide 
the engineering, mathematical, and scientific communities with significant
developments in harmonic analysis, ranging from abstract harmonic analysis to basic
applications. The title of the series reflects the importance of applications and numerical
implementation, but richness and relevance of applications and implementation 
depend fundamentally on the structure and depth of theoretical underpinnings. Thus, 
from our point of view, the interleaving of theory and applications and their creative 
symbiotic evolution is axiomatic. 

Harmonic analysis is a wellspring of ideas and applicability that has flourished, 
developed, and deepened over time within many disciplines and by means of 
creative cross-fertilization with diverse areas. The intricate and fundamental
relationship between harmonic analysis and fields such as signal processing, partial
differential equations (PDEs), and image processing is reflected in our
state-of-the-art ANHA series.

Our vision of modern harmonic analysis includes mathematical areas such as 
wavelet theory, Banach algebras, classical Fourier analysis, time-frequency analysis, 
and fractal geometry, as well as the diverse topics that impinge on them. 

For example, wavelet theory can be considered an appropriate tool to deal with 
some basic problems in digital signal processing, speech and image processing,
geophysics, pattern recognition, biomedical engineering, and turbulence. These areas
implement the latest technology from sampling methods on surfaces to fast
algorithms and computer vision methods. The underlying mathematics of wavelet theory
depends not only on classical Fourier analysis, but also on ideas from abstract
harmonic analysis, including von Neumann algebras and the affine group. This leads
to a study of the Heisenberg group and its relationship to Gabor systems, and of the
metaplectic group for a meaningful interaction of signal decomposition methods.
The unifying influence of wavelet theory in the aforementioned topics illustrates the
justification for providing a means for centralizing and disseminating information
from the broader, but still focused, area of harmonic analysis. This will be a key role
of ANHA. We intend to publish with the scope and interaction that such a host of
issues demands. 

Along with our commitment to publish mathematically significant works at the 
frontiers of harmonic analysis, we have a comparably strong commitment to publish 
major advances in the following applicable topics in which harmonic analysis plays 
a substantial role: 


Antenna theory
Biomedical signal processing
Digital signal processing
Fast algorithms
Gabor theory and applications
Image processing
Numerical partial differential equations
Prediction theory
Radar applications
Sampling theory
Spectral estimation
Speech processing
Time-frequency and time-scale analysis
Wavelet theory


The above point of view for the ANHA book series is inspired by the history of 
Fourier analysis itself, whose tentacles reach into so many fields. 

In the last two centuries Fourier analysis has had a major impact on the
development of mathematics, on the understanding of many engineering and scientific
phenomena, and on the solution of some of the most important problems in
mathematics and the sciences. Historically, Fourier series were developed in the analysis
of some of the classical PDEs of mathematical physics; these series were used to
solve such equations. In order to understand Fourier series and the kinds of
solutions they could represent, some of the most basic notions of analysis were defined,
e.g., the concept of “function.” Since the coefficients of Fourier series are integrals, 
it is no surprise that Riemann integrals were conceived to deal with uniqueness 
properties of trigonometric series. Cantor’s set theory was also developed because 
of such uniqueness questions. 

A basic problem in Fourier analysis is to show how complicated phenomena, 
such as sound waves, can be described in terms of elementary harmonics. There are 
two aspects of this problem: first, to find, or even define properly, the harmonics or 
spectrum of a given phenomenon, e.g., the spectroscopy problem in optics; second, 
to determine which phenomena can be constructed from given classes of harmonics, 
as done, for example, by the mechanical synthesizers in tidal analysis. 

Fourier analysis is also the natural setting for many other problems in
engineering, mathematics, and the sciences. For example, Wiener’s Tauberian theorem in
Fourier analysis not only characterizes the behavior of the prime numbers, but also
provides the proper notion of spectrum for phenomena such as white light; this latter
process leads to the Fourier analysis associated with correlation functions in
filtering and prediction problems, and these problems, in turn, deal naturally with Hardy
spaces in the theory of complex variables. 

Nowadays, some of the theory of PDEs has given way to the study of Fourier 
integral operators. Problems in antenna theory are studied in terms of
unimodular trigonometric polynomials. Applications of Fourier analysis abound in signal
processing, whether with the fast Fourier transform (FFT), or filter design, or the
adaptive modeling inherent in time-frequency-scale methods such as wavelet theory.
The coherent states of mathematical physics are translated and modulated Fourier
transforms, and these are used, in conjunction with the uncertainty principle, for
dealing with signal reconstruction in communications theory. We are back to the
raison d’être of the ANHA series!


John J. Benedetto 
Series Editor 
University of Maryland 
College Park 
Preface

Classical harmonic analysis studies problems related to series expansions of
signals or functions using trigonometric polynomials. The theory of Fourier series
and Fourier integrals forms the core of harmonic analysis and extends from there
to other mathematical areas such as the theory of singular integrals, approximation
theory, and sampling theory, just to mention a few. Harmonic analysis is also used in
numerous applications where it can be thought of as the mathematical backbone for
a large number of modern methods in signal analysis and signal processing as well as
image analysis and image processing. Its internal growth has seen generalizations
to nontrigonometric expansions and noncommutative group settings, but its basic
role in other areas of mathematics (differential equations, number theory,
probability theory, and statistics), physics and chemistry (wave phenomena, crystallography,
and optics), financial analysis (time series), medicine (tomography, brain and heart
wave analyses), and biological signal processing has made harmonic analysis the
main fundamental contributor to all of the 20th century’s human-based technologies.
These include telephone, radio, television, radar and sonar, satellite and wireless 
communications, medical imaging, the Internet, and multimedia. 

The applications of harmonic analysis to medical image processing have been 
undergoing a rapid change primarily driven by better hardware and software. Part 
of this development is an attempt by researchers to base medical engineering
principles on solid and rigorous mathematical foundations, and to develop mathematical
methods that allow the creation of effective software programs that reduce or replace 
invasive medical procedures. 

Approximation theory and harmonic analysis benefit from each other. The 
latter provides the means that the former uses to approximate complicated
functions or signals and surfaces or images, and to estimate the errors of this
approximation. On the other hand, harmonic analysis problems often require methods or input
from approximation theory. Like harmonic analysis, approximation theory has seen 
decades of rapid development and growth, again, primarily driven by applications, 
such as computer-aided geometric design (CAGD) and its various ramifications. 

Recently, a great deal of emphasis has been put into the digitization, transmission, 
and processing of three-dimensional data sets. One-dimensional methods developed 
in harmonic analysis and approximation theory in the past do not easily carry over to
this higher-dimensional setting. Instead, new ideas and methods need to be found to
take into account the nonisotropy and nonhomogeneities inherent in such data sets. 
In order for these generalizations to take place, new ideas from lower-dimensional 
problems need to be reconsidered. As an example, we take the effective design of 
wave forms that is essential to the simultaneous transmission of clear messages on 
the same frequency band. Constructive approximations of unimodular sequences 
whose autocorrelations vanish on prescribed sets are introduced, and their analysis 
depends significantly on Wiener’s generalized harmonic analysis (see [19]).

Signal analysis and image analysis have greatly benefited from the theory of 
wavelets and their generalizations to frames. These multiscale methods use
representations based on two specific groups that are used to transfer information between
the scales and within each scale. It has become clear that for multidimensional data, 
more general groups and multiscale methods need to be employed. The geometry 
involved in such a high-dimensional setting is more complicated and challenging 
than in the one-dimensional case, as spatial and, in the video setting, even temporal 
features need to be taken into account. A first step toward such an improvement in 
representation is undertaken in [130, 233]. 

This advanced textbook is intended for graduate students, pure and applied 
mathematicians, mathematical physicists, and engineers working in image/signal 
processing and communication theory. The book may be used in an advanced topics 
course or in a seminar on harmonic analysis and its applications to image and signal
analysis. The prerequisites are a solid background in linear algebra and real analysis 
and knowledge of the fundamentals of functional analysis and metric topology. 

Chapters 2, 3, 4, and 5 in this book are based on lectures given by their authors 
at the summer school on New Trends and Directions in Harmonic Analysis, 
Approximation Theory, and Image Analysis, which took place in Inzell, Germany, 
from September 17 to 21, 2007. One of the goals of this summer school was to bring
together a distinguished group of highly established international researchers to 
present their latest cutting-edge research, and, in conjunction with a small group 
of scientists including young researchers, to establish new and exciting directions 
for future investigation into the topics described above. 

A short introduction to the mathematical aspects of time-frequency analysis 
paves the way for the above-mentioned chapters. The reader is exposed to the main 
themes presented in this book and provided with a summary of those mathematical 
notions and concepts needed to fully appreciate the contents of Chapters 2 to 5. In 
addition, the material in these chapters is put into perspective in this introductory 
chapter. 

Chapters 2 to 5 were written by internationally renowned mathematicians and 
have an expository and interdisciplinary character, allowing the reader to understand 
the theory behind modern image and signal processing methodologies. In detail, the 
chapters cover the following. 

Ole Christensen considers B-spline generated frames. He exploits the
flexibility of frames and combines them with the elegant representations for B-splines.
In the first part of his chapter, he introduces the terminology of Bessel sequences,
Riesz bases, and frames and exhibits their central properties. In the second part, he
considers concrete constructions for Gabor systems and other tight frames, before
he finally deduces the wavelet frames generated by B-splines via the so-called 
unitary extension principle. 

Demetrio Labate and Guido Weiss consider the theory and applications of 
composite wavelets. They first describe the unified theory of reproducing systems, a 
simple and flexible mathematical framework to characterize and analyze wavelets, 
Gabor systems, and other reproducing systems in a unified manner. These systems 
can be rewritten as a countable family of translations applied to a countable
collection of functions. The authors then define wavelets with composite dilations, a novel
class of reproducing systems that provide truly multidimensional generalizations of
traditional wavelets, and discuss so-called shearlets as a special case of optimally
sparse representations for 2D. Applications in edge detection and the continuous
analogues of composite wavelets are also discussed.

Pierre Vandergheynst and Yves Wiaux introduce wavelets on the sphere and 
therefore leave the classical Cartesian space. For many applications such as
astrophysics, geophysics, neuroscience, computer vision, and computer graphics, data
are given as functions on the sphere. In all these situations, one is compelled to 
design data analysis tools that are adapted to spherical geometry, for one cannot 
simply project the data into Euclidean geometry without having to deal with severe 
distortions. The authors provide a generalization of the wavelet transform to signals 
on the sphere. This generalization is not trivial, as the dilation operator is not well 
defined on the sphere. In addition, any algorithm faces the problem of how to
sample data on the sphere. This chapter discusses some recently developed methods for
the analysis and reconstruction of signals on the sphere with wavelets, on the basis 
of theory, implementation, and applications. 

Karlheinz Gröchenig presents various new and interesting aspects of Wiener’s
Lemma. This result is one of the main theorems of Banach algebra theory. In
the first part of his chapter, he discusses Wiener’s Lemma in detail and investigates
equivalent formulations for convolution operators. In the second part, he considers
several variations, especially in noncommutative settings. He also shows the
importance of the lemma for time-varying systems and pseudodifferential operators and
concludes with applications in mobile communications. 

One of the main features of this book is its emphasis on the interdependence of 
these four modern research directions. Each chapter ends with exercises that allow 
for a more in-depth understanding of the material and are intended to stimulate the 
reader to further research. 

We would like to thank the VolkswagenStiftung for generously providing the 
funds and support for the summer school on New Trends and Directions in 
Harmonic Analysis, Approximation Theory, and Image Analysis in Inzell, Germany. 

Our thanks also go to the Institute for Biomathematics and Biometry at the 
Helmholtz Zentrum München and the Centre of Mathematics, Research Unit
M6 - Mathematical Modelling, at the Technische Universität München.

We also would like to acknowledge that this work was partially supported 
by the grant MEXT-CT-2004-013477, Acronym MAMEBIA, of the European 
Commission. 
We wish to express our gratitude to Birkhäuser and its professional staff, in
particular, Tom Grasso, Regina Gorenshteyn, and Patrick Keene, for their support 
and help during the preparation of this book. 

And last but not least, we heartily thank John Benedetto, who had the initial idea 
for this book and invested the time and energy to launch it. 


Munich, Germany Brigitte Forster 

August 2009 Peter Massopust 
Chapter 1

Introduction: Mathematical Aspects of Time-Frequency Analysis


Peter Massopust and Brigitte Forster 


Abstract Time-frequency analysis of signals or images deals with mathematical 
transforms of continuous or discrete data with the aim of having more information 
accessible after than before the transform. The possible choice of the respective 
transform strongly depends on the mathematical model of the signal or image 
source. The modeling scheme affects which analyzing transforms can be applied 
in a mathematically sensible way, e.g., the mapping should be continuous, and also 
which transforms give access to new and, in particular, well-formulated
interpretations of the data.

In this chapter, we present the ideas behind the optimal choice of an appropriate 
modeling scheme, the standard signal and image models, and the most important 
mathematical analysis transforms. This covers aspects of Fourier series and
integrals, sampling or discretization problems, and various windowed transforms, such
as the short-time Fourier transform, the Gabor transform, and wavelets. 

The chapter gives an introduction to the main mathematical terms used in the 
subsequent four chapters of the book. It shows their relation and interplay and gives 
entrance points to the lectures presented in subsequent chapters. We provide
citations to references for further and more in-depth reading and conclude with a list of
exercises. 


1.1 Aims of Time-Frequency Analysis 

Time-frequency analysis deals with the characterization and manipulation of signals 
whose frequency components vary in time. By a signal, we understand a
complex-valued functional $f : X \to \mathbb{C}$, where $X$ is a Banach or Hilbert space and $\mathbb{C}$ denotes
the field of complex numbers. The choice of the time domain $X$ determines different
types of signals. For instance,

• $X := \mathbb{R}$ describes a time-continuous signal;

• $X := \mathbb{Z}$ or $X := \mathbb{N}$ a discrete signal in time;

• $X := [a,b]$, $-\infty < a < b < +\infty$, a signal that is time-limited.

Among these signals, one also distinguishes the following classes: 

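• $T$-periodic signals: $f(t) = f(t + kT)$, $T \in \mathbb{R}^+$, and $k \in \mathbb{Z}$;

• finite energy signals: $f \in L^2(\mathbb{R})$ or $f \in \ell^2(\mathbb{Z})$;

• bounded signals: $f \in L^\infty(\mathbb{R})$ or $f \in \ell^\infty(\mathbb{Z})$;

• integrable or summable signals: $f \in L^1(\mathbb{R})$, resp., $f \in \ell^1(\mathbb{Z})$.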

In many applications, signals are measured in order to use them to monitor or 
regulate a time-varying process, or to ensure and manage its quality. The measured 
signals must be brought into a form that allows for efficient and quick evaluation and 
interpretation. In order to achieve this, a signal $f$ is transformed so that its image
$\hat{f}$ under the transform is more easily interpretable and analyzable. In particular, $\hat{f}$
should be such that any unwanted noise can be filtered out or removed, and the
characteristic system parameters can be estimated. In addition, more information
should be extractable from the transformed image $\hat{f}$ than from $f$ itself.


1.1.1 Signal and Model 

There are many families of functions that can be used to analyze a signal by means 
of a series expansion or an integral transform. Which family of functions is to be 
chosen to analyze a signal depends on the underlying model for the system that
generates the signal. For instance, for time-limited signals a different function family
is used than for time-unlimited signals, and for time-continuous signals a different 
one than for time-discrete signals. However, the procedure is the same in each case. 
Figure 1.1 shows a schematic sketch of this procedure. 

One of the first analyzing systems was developed by J. B. J. Fourier and is known 
as the Fourier series. Originally, Fourier was interested in finding an elegant way to 
solve the heat equation. He expanded the initial value function or distribution in 
a series of complex exponentials and then could easily derive the solution from 
the series’ coefficients [162, 200, 223]. In fact, the Fourier series of a function
$f \in L^p[-\pi,\pi]$, $1 \le p < \infty$, or $f \in C[-\pi,\pi]$ is given by

$$f \sim \sum_{k\in\mathbb{Z}} \hat{f}(k)\, e^{ik\,\cdot}, \tag{1.1}$$

where

$$\hat{f}(k) := \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\, e^{-ikt}\,dt, \qquad k \in \mathbb{Z},$$

are the Fourier coefficients. They allow the following interpretation: $|\hat{f}(k)|$ is the
amplitude and $\arg(\hat{f}(k))$ is the phase corresponding to the frequency $k \in \mathbb{Z}$.

Fig. 1.1: The different methods for signal analysis employ the same scheme. Coefficients
are associated with a signal, and these coefficients are transformed. From these new
coefficients, a transformed signal is reconstructed.

The series (1.1) converges in norm for the $L^p$-spaces with $1 < p < \infty$, such that
the Fourier transform $\mathcal{F} : L^p[-\pi,\pi] \to c_0(\mathbb{Z})$, $f \mapsto \{\hat{f}(k)\}_{k\in\mathbb{Z}}$, is a linear,
continuous, and on its image continuously invertible operator. [Here, $c_0(\mathbb{Z})$ denotes the
space of all sequences with domain $\mathbb{Z}$ converging to zero.] For $L^1[-\pi,\pi]$ and
$C[-\pi,\pi]$, convergence of (1.1) is attained with so-called approximate identities
[151, 162, 248]. 

A drawback of the Fourier series for signal analysis is that a local change of the 
function / results in a global change of the coefficients. Therefore, mathematical 
interest began to focus more on localized transforms, such as the Haar system
(see Section 1.1.2.3) or the Rademacher system (see Section 1.1.2.4), and more 
recently wavelet multiresolution analyses (see Section 1.5.3). 
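
The global effect of a local change can be observed numerically. The following is a minimal sketch, assuming NumPy and a uniform grid on $[-\pi,\pi)$. The test function $\cos 3t$, the perturbation (the indicator of a short interval around $t = 0$), the tolerance, and the helper name `fourier_coefficients` are illustrative choices, not taken from the text. The coefficients are approximated by a Riemann sum that is evaluated with the FFT.

```python
import numpy as np

def fourier_coefficients(samples):
    """Riemann-sum approximation of (1/2pi) * int_{-pi}^{pi} f(t) e^{-ikt} dt.

    `samples` holds f on the uniform grid t_j = -pi + 2*pi*j/n. The FFT
    evaluates the sum for all k at once; the factor exp(i*k*pi) accounts
    for the grid starting at -pi instead of 0.
    """
    n = len(samples)
    k = np.fft.fftfreq(n, d=1.0 / n)          # integer frequencies in FFT order
    return np.fft.fft(samples) / n * np.exp(1j * k * np.pi)

n = 1024
t = -np.pi + 2 * np.pi * np.arange(n) / n
f = np.cos(3 * t)                              # only k = 3 and k = -3 are present
bump = np.where(np.abs(t) < 0.05, 1.0, 0.0)    # local perturbation near t = 0

c_f = fourier_coefficients(f)
c_g = fourier_coefficients(f + bump)

# Count how many coefficients moved by more than a small tolerance.
changed = int(np.sum(np.abs(c_g - c_f) > 1e-6))
print(changed, "of", n, "coefficients changed")
```

Although the perturbation is supported on a small interval, it alters coefficients across the whole frequency range, which is the drawback described above.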

The same drawback as for the Fourier series applies to integral transforms,
such as the Fourier transform for $L^2(\mathbb{R}^n)$ functions (see Section 1.3). To
cope with this problem, local transforms such as the short-time Fourier transform 
(Section 1.4.1) and, as a special case, the Gabor transform (Section 1.4.2) were 
developed. 


1.1.2 Transforms 


For the better interpretability of a signal, a whole variety of possible transforms may 
be considered. But the question is, which ones are appropriate? One major point is 
that the transform should not hide or cut off any information. This means that if,
in the diagram in Fig. 1.1, we choose the identity operator as the “manipulation,” 
i.e., we leave the coefficients as they are, then the measured signal and the output, 
the utilizable signal, should coincide. This fact gives rise to several mathematical 
requirements: 

• The transform should be continuous: Quantitatively small changes in the signal 
should cause only quantitatively small effects in the transform’s image. 

• The transform should be continuously invertible. 

• There should exist an invertible discrete version of the transform. 

• There should exist a stable numerical algorithm. 

As examples, we review three common transforms based on Fourier series, Haar 
wavelets, and Rademacher functions. 


1.1.2.1 Fourier Series 

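Let $\mathbb{T} := \{z \in \mathbb{C} \mid |z| = 1\} = \{e^{it} \mid t \in [0,2\pi)\}$ be the torus. Then $\mathbb{T} = S^1$ is a compact
subset of $\mathbb{R}^2$ and a commutative group with respect to multiplication. Since the
multiplication $\cdot : \mathbb{T} \times \mathbb{T} \to \mathbb{T}$ is continuous with respect to the metric topology on
$\mathbb{R}^2$, $\mathbb{T}$ is a topological group.

Every function $f : \mathbb{T} \to \mathbb{C}$ can be uniquely identified with a $2\pi$-periodic function
$\tilde{f} : \mathbb{R} \to \mathbb{C}$ via $\tilde{f}(t + 2\pi n) = \tilde{f}(t) = f(e^{it})$, where $t \in [0,2\pi)$, $n \in \mathbb{Z}$. The Lebesgue
measure on $[0,2\pi)$ is mapped onto $\mathbb{T}$ via

$$\int_{\mathbb{T}} f(z)\,dz = \int_0^{2\pi} f(e^{it})\,dt = \int_0^{2\pi} \tilde{f}(t)\,dt,$$

provided the Lebesgue integral exists. We denote by $L^2(\mathbb{T})$ the space of all
complex-valued functions on $\mathbb{T}$ that are square-integrable with respect to the Lebesgue
measure. Endowed with the inner product

$$\langle f, g\rangle := \frac{1}{2\pi}\int_{\mathbb{T}} f(z)\,\overline{g(z)}\,dz, \qquad \text{for } f, g \in L^2(\mathbb{T}),$$

$L^2(\mathbb{T})$ becomes a Hilbert space.

Remark 1.1. Let $1 \le p < \infty$. Then $L^p(\mathbb{T}) = \mathcal{L}^p(\mathbb{T})/\mathcal{N}$, where

$$\mathcal{L}^p(\mathbb{T}) = \Big\{ f : \mathbb{T} \to \mathbb{C} \;\Big|\; f \text{ Lebesgue-measurable and } \int_{\mathbb{T}} |f(t)|^p\,dt < \infty \Big\}$$

and $\mathcal{N} = \{f : \mathbb{T} \to \mathbb{C} \mid f = 0 \text{ a.e.}\}$.

Convention: In the following, we identify $f$ and $\tilde{f}$.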

The complex trigonometric system $\{e^{in\,\cdot}\}_{n\in\mathbb{Z}}$ is an orthonormal basis for
the Hilbert space $L^2(\mathbb{T})$. (This can be easily verified by a direct computation.)
Orthogonality then implies that the family is minimal and complete. It can be shown
(see Section 1.2.1) that this implies that each $f \in L^2(\mathbb{T})$ has a unique Fourier series
representation of the form

$$f = \sum_{n\in\mathbb{Z}} \langle f, e^{in\,\cdot}\rangle\, e^{in\,\cdot} \quad \text{in } L^2(\mathbb{T}),$$

where

$$\langle f, e^{in\,\cdot}\rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\, e^{-int}\,dt, \qquad n \in \mathbb{Z}.$$

The Parseval equality, which is given in (1.10) in a general form, implies that

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t)|^2\,dt = \sum_{n\in\mathbb{Z}} |\langle f, e^{in\,\cdot}\rangle|^2, \qquad \forall f \in L^2(\mathbb{T}),$$

and that the mapping

$$\mathcal{F} : L^2(\mathbb{T}) \to \ell^2(\mathbb{Z}), \qquad f \mapsto \{\langle f, e^{in\,\cdot}\rangle\}_{n\in\mathbb{Z}},$$

is a Hilbert space isomorphism.

Therefore, we have the following result. 

Theorem 1.2. The trigonometric system $\{e^{in\,\cdot}\}_{n\in\mathbb{Z}}$ is complete in $L^2(\mathbb{T})$.

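Proof. We employ the Weierstrass theorem. Suppose that $f \in L^2(\mathbb{T})$ is an integrable
function with the property that $f \perp \operatorname{span}\{e^{in\,\cdot}\}_{n\in\mathbb{Z}}$. Then

$$\langle f, e^{in\,\cdot}\rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\, e^{-int}\,dt = 0, \qquad \forall n \in \mathbb{Z}.$$

We show that $f = 0$ almost everywhere. To this end, let

$$g(t) = \int_{-\pi}^{t} f(u)\,du, \qquad \text{for } t \in [-\pi,\pi].$$

Note that $g$ is continuous. Suppose that $c \in \mathbb{C}$ is a constant. Integration by parts
yields

$$\int_{-\pi}^{\pi} (g(t) - c)\, e^{-int}\,dt = 0, \qquad \text{for all } n \in \mathbb{Z}\setminus\{0\}. \tag{1.2}$$

Now choose $c$ such that (1.2) also holds for $n = 0$ and set $F(t) := g(t) - c$. Then $F$
is continuous on $[-\pi,\pi]$ and $F(\pi) = F(-\pi) = -c$. The Weierstrass theorem now
implies that for all $\varepsilon > 0$ there exists a trigonometric sum

$$T(t) = \sum_{k=-N}^{N} c_k\, e^{ikt}$$

so that

$$|F(t) - T(t)| < \varepsilon, \qquad \text{for } |t| \le \pi.$$

Thus, since $F$ is orthogonal to every $e^{in\,\cdot}$ and hence to $T$,

$$\|F\|^2 = \frac{1}{2\pi}\int_{-\pi}^{\pi} |F(t)|^2\,dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} \overline{F(t)}\,\big(F(t) - T(t)\big)\,dt \le \varepsilon\,\frac{1}{2\pi}\int_{-\pi}^{\pi} |F(t)|\,dt \le \varepsilon\,\|F\|.$$

Hence, $\|F\| \le \varepsilon$ and since $\varepsilon$ was arbitrary, we have that $F = 0$. Thus, $g = c$ and
therefore $f = 0$ almost everywhere. □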

Remark 1.3. Analogously, one establishes the completeness of $\{e^{in\,\cdot}\}_{n\in\mathbb{Z}}$ in $L^p(\mathbb{T})$
for all $1 \le p < \infty$.

The results obtained in this section can be summarized in the following theorem. 

Theorem 1.4. The trigonometric system $\{e^{in\,\cdot}\}_{n\in\mathbb{Z}}$ is an orthonormal basis for the
Hilbert space $L^2(\mathbb{T})$.


1.1.2.2 Convolution 

Denote by $L^1(\mathbb{T})$ the Banach space of all complex-valued Lebesgue-integrable
functions on the torus $\mathbb{T}$ endowed with the $L^1$-norm.

Theorem 1.5. Suppose that $f, g \in L^1(\mathbb{T})$. Then the function

$$t \mapsto f(s - t)\, g(t)$$

is absolutely integrable for almost all $s \in [-\pi,\pi]$. Setting

$$h(s) := \frac{1}{2\pi}\int_{-\pi}^{\pi} f(s - t)\, g(t)\,dt,$$

one has that $h \in L^1(\mathbb{T})$ and $\|h\|_1 \le \|f\|_1 \|g\|_1$. The Fourier coefficients satisfy

$$\hat{h}(n) = \hat{f}(n) \cdot \hat{g}(n) \tag{1.3}$$

for all $n \in \mathbb{Z}$.

Proof. Exercise! □

Definition 1.6. Assume that $f, g \in L^1(\mathbb{T})$. The a.e. defined function $h : \mathbb{T} \to \mathbb{C}$ in
Theorem 1.5 is denoted by $f * g$ and is called the convolution of $f$ and $g$.

Example 1.7 (Filtering). Consider a $2\pi$-periodic signal $f \in L^1(\mathbb{T})$, i.e., a function
$f : \mathbb{T} \to \mathbb{C}$, that contains the frequencies $n \in \mathbb{Z}$, i.e., $\hat{f}(n) \ne 0$.

A filter is a function $g \in L^1(\mathbb{T})$ that removes certain frequencies $I \subset \mathbb{Z}$ and
preserves all others: $\hat{g}(n) = 0$, for $n \in I$, and $\hat{g}(n) = 1$, for $n \in \mathbb{Z} \setminus I$.

The filtered signal $h = f * g$ contains on $\mathbb{Z} \setminus I$ exactly the frequencies of $f$ and on
$I$ no frequencies.
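
Example 1.7 can be imitated in a discretized setting. The following is a minimal sketch, assuming NumPy and a signal sampled on a uniform grid of $[0, 2\pi)$. The test signal and the removed index set $I = \{5, -5\}$ are hypothetical choices made for illustration. Instead of evaluating the convolution integral, the sketch applies relation (1.3) directly to the discrete Fourier coefficients.

```python
import numpy as np

n = 256
t = 2 * np.pi * np.arange(n) / n                     # uniform grid on [0, 2*pi)
f = np.cos(2 * t) + 0.5 * np.sin(5 * t) + 0.25 * np.cos(9 * t)

k = np.fft.fftfreq(n, d=1.0 / n).astype(int)         # integer frequencies in FFT order
removed = [5, -5]                                    # hypothetical index set I
g_hat = np.where(np.isin(k, removed), 0.0, 1.0)      # multiplier: 0 on I, 1 elsewhere

f_hat = np.fft.fft(f) / n                            # discrete Fourier coefficients of f
h_hat = f_hat * g_hat                                # relation (1.3), coefficient by coefficient
h = np.real(np.fft.ifft(h_hat * n))                  # filtered signal on the grid

# The component with frequency 5 is gone, the others are untouched.
expected = np.cos(2 * t) + 0.25 * np.cos(9 * t)
max_err = np.max(np.abs(h - expected))
print(max_err)
```

On a finite grid the multiplier corresponds to a trigonometric polynomial, so this is only a finite-dimensional analogue of the filter in the example.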
1.1.2.3 Haar System

In 1910, Haar [131] introduced a Schauder basis for $L^p[0,1)$, $1 \le p < \infty$, which is
an unconditional basis if $p > 1$. (Details on Schauder bases and unconditional bases
are explained in Section 1.2.1.) Let

$$\psi_H(t) := \begin{cases} 1, & 0 \le t < \tfrac{1}{2}; \\ -1, & \tfrac{1}{2} \le t < 1; \\ 0, & \text{otherwise.} \end{cases}$$

The Haar system is defined as the family of functions

$$\{2^{n/2}\,\psi_H(2^n t - k) \mid k = 0, 1, \dots, 2^n - 1;\ n \in \mathbb{N}_0\}.$$

It is easy to verify that the Haar system satisfies the $L^2$-orthogonality conditions

$$\int_{\mathbb{R}} 2^{m/2}\,\psi_H(2^m t - k)\; 2^{n/2}\,\psi_H(2^n t - l)\,dt = \delta_{mn}\,\delta_{kl}, \qquad k, l \in \mathbb{Z};\ m, n \in \mathbb{N}_0.$$

If $\phi := \chi_{[0,1)}$ denotes the characteristic function of $[0,1)$, then one has the relation

$$\psi_H = \phi(2\,\cdot) - \phi(2\,\cdot - 1).$$

The disadvantage of the Haar system is that its functions are not continuous.
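
The orthogonality conditions and the two-scale relation can be checked numerically for the first few levels. The following is a minimal sketch, assuming NumPy. The cut-off `levels = 4`, the dyadic midpoint grid, and the helper names `psi_h` and `haar` are illustrative choices. Because every function involved is constant on each grid cell, the mean over the cell midpoints reproduces the integral over $[0,1)$ up to rounding.

```python
import numpy as np

def psi_h(t):
    """Haar function: 1 on [0, 1/2), -1 on [1/2, 1), 0 otherwise."""
    return np.where((t >= 0) & (t < 0.5), 1.0,
                    np.where((t >= 0.5) & (t < 1.0), -1.0, 0.0))

def haar(n, k, t):
    """Dilated and translated Haar function 2^(n/2) * psi_H(2^n t - k)."""
    return 2.0 ** (n / 2) * psi_h(2.0 ** n * t - k)

levels = 4                                   # use n = 0, ..., 3 (illustrative cut-off)
m = 2 ** (levels + 2)                        # number of grid cells in [0, 1)
t = (np.arange(m) + 0.5) / m                 # cell midpoints

family = [haar(n, k, t) for n in range(levels) for k in range(2 ** n)]

# Gram matrix of L2[0,1) inner products; orthonormality means it is the identity.
gram = np.array([[np.mean(u * v) for v in family] for u in family])
is_orthonormal = np.allclose(gram, np.eye(len(family)))

# Two-scale relation psi_H = phi(2.) - phi(2. - 1), phi the indicator of [0, 1).
def phi(s):
    return np.where((s >= 0) & (s < 1.0), 1.0, 0.0)

relation_holds = np.array_equal(psi_h(t), phi(2 * t) - phi(2 * t - 1))
print(is_orthonormal, relation_holds)
```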


1.1.2.4 Rademacher System 


The Rademacher system [199] is given by the family of functions $R := \{r_n \mid n \in \mathbb{N}_0\}$,
where

$$r_n(t) := \operatorname{sgn}\sin(2^n \pi t), \qquad t \in [0,1).$$

Here $\operatorname{sgn} : \mathbb{R} \to \mathbb{R}$ denotes the signum function

$$\operatorname{sgn}(t) := \begin{cases} t/|t|, & t \ne 0; \\ 0, & t = 0. \end{cases}$$

It can be shown that the family $R$ constitutes an orthonormal system for $L^2[0,1)$ with
respect to the $L^2$-inner product, but not an orthonormal basis. An obvious
disadvantage of the Rademacher functions is their discontinuity.

Both Haar and Rademacher functions have support contained in $[0,1]$ and are
orthogonal with respect to the $L^2$-inner product, and both families are generated
by a single function. The difference between the two systems lies in the way the
family of functions is generated: For the Haar system, one needs the dyadic dilates
and integer-translates of $\psi_H$, whereas for the Rademacher system, only the dyadic
dilates of $r_0$ are needed.
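
Both claims about the Rademacher system, orthonormality and the failure to be a basis, can be illustrated numerically. The following is a minimal sketch, assuming NumPy and the convention $r_n(t) = \operatorname{sgn}\sin(2^n \pi t)$. The number of functions, the midpoint grid, and the witness function $w = r_1 r_2$ are illustrative choices. The function $w$ has norm one but is orthogonal to every Rademacher function tested, so it cannot be expanded in the system.

```python
import numpy as np

def rademacher(n, t):
    """r_n(t) = sgn(sin(2^n * pi * t)) for t in [0, 1)."""
    return np.sign(np.sin(2.0 ** n * np.pi * t))

m = 2 ** 10
t = (np.arange(m) + 0.5) / m                 # cell midpoints, away from the sign changes
count = 6                                    # r_0, ..., r_5 (illustrative cut-off)

R = np.array([rademacher(n, t) for n in range(count)])

# Gram matrix of L2[0,1) inner products, computed as means over the grid.
gram = R @ R.T / m

# Witness against completeness: the product of two Rademacher functions.
w = R[1] * R[2]
coeffs = R @ w / m                           # inner products <w, r_n>
norm_w_sq = np.mean(w * w)

print(np.allclose(gram, np.eye(count)), np.allclose(coeffs, 0.0), norm_w_sq)
```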
1.1.3 Signal Manipulations—Filters

In order to use and process information obtained from a measured or received signal,
the sequence of real or complex values describing the signal must be manipulated
to yield a utilizable representation. This manipulation of a signal is called filtering.
Mathematically, it corresponds to transforming a given signal $f$ into a new signal
$\hat{f}$ via a transform (operator) $\mathcal{T}$. The transform $\mathcal{T}$ can be linear, as in the case of a
bandpass filter (see Example 1.7), or nonlinear, as in the case of denoising, where
coefficients whose values are in magnitude less than a given threshold are set equal
to zero. As a simple example of a transform or filter $\mathcal{T}$, we mention the Fourier
series introduced in Section 1.1.2.1. Here the signal is represented by its frequency
content.

Let $\sigma_\tau$ denote the time shift (by $\tau > 0$), i.e., $\sigma_\tau(f) := f(\cdot + \tau)$. A transform $\mathcal{T}$
is called time-invariant if

$$\mathcal{T}(\sigma_\tau f) = \sigma_\tau(\mathcal{T} f).$$

Any linear time-invariant transform $\mathcal{T} : L^2(\mathbb{R}) \to L^2(\mathbb{R})$ acting on a signal
$f \in L^2(\mathbb{R})$ can be represented as a convolution:

$$\mathcal{T}(f)(t) = (F * f)(t) := \int_{\mathbb{R}} F(t - s)\, f(s)\,ds, \qquad t \in \mathbb{R}, \tag{1.4}$$

where the function $F \in L^2(\mathbb{R})$ is usually called the impulse response of the filter
(see [175, Chapter II]). The fact that $F$ is in $L^2(\mathbb{R})$ is a consequence of the Riesz
representation theorem. (Show this!)

In case the signal $f$ is representable by a sequence, i.e., $f \in \ell^2(\mathbb{Z})$, and $\mathcal{T} :
\ell^2(\mathbb{Z}) \to \ell^2(\mathbb{Z})$ is again a linear time-invariant transform, the discrete analogue of
(1.4) then reads

$$\mathcal{T}(f)(n) = (F * f)(n) := \sum_{k\in\mathbb{Z}} F(n - k)\, f(k), \qquad n \in \mathbb{Z}.$$
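
The discrete convolution and the time-invariance property can be tried out on finitely supported sequences. The following is a minimal sketch, assuming NumPy. The impulse response `F`, the signal `f`, and the helper names `lti` and `shift` are hypothetical. For convenience the sketch shifts by delaying the sequence (zeros are prepended) rather than advancing it; time invariance is the statement that the filter commutes with either kind of shift.

```python
import numpy as np

def lti(F, f):
    """(F * f)(n) = sum_k F(n - k) f(k) for finitely supported sequences.

    Both arrays are indexed from n = 0; the result has length
    len(F) + len(f) - 1.
    """
    return np.convolve(F, f)

def shift(f, tau):
    """Delay a finitely supported sequence by tau >= 0 samples."""
    return np.concatenate([np.zeros(tau), f])

F = np.array([0.25, 0.5, 0.25])        # hypothetical impulse response (smoothing)
f = np.array([1.0, 2.0, 0.0, -1.0])    # hypothetical input signal

out = lti(F, f)

# Time invariance: filtering the shifted signal equals shifting the filtered signal.
tau = 3
lhs = lti(F, shift(f, tau))
rhs = shift(lti(F, f), tau)
commutes = np.allclose(lhs, rhs)
print(out, commutes)
```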


1.1.4 Why Discretizing? Techniques, Challenges, Pitfalls 


One approach for obtaining a numerical solution of a mathematical problem is to 
discretize it. The discrete problem can then, in many cases, be solved in a
computationally efficient way.

The general procedure of discretization is as follows. Let $B_1$, $B_2$, $B_1^n$, and $B_2^n$,
$n \in \mathbb{N}$, be Banach spaces. Let $\mathcal{T} : B_1 \to B_2$ be a linear, not necessarily continuous operator
on $B_1$. Suppose that for all $n \in \mathbb{N}$, the operator $\mathcal{T}_n : B_1^n \to B_2^n$ is linear and that
$(\mathcal{T}_n)^{-1}$ exists and is linear. Moreover, assume that the discretization operators
$\mathcal{D}_i^n : B_i \to B_i^n$, $i = 1, 2$, are linear.

$$\begin{array}{ccc}
B_1 & \xrightarrow{\ \mathcal{T}\ } & B_2 \\
{\scriptstyle \mathcal{D}_1^n}\big\downarrow & & \big\downarrow{\scriptstyle \mathcal{D}_2^n} \\
B_1^n & \xrightarrow{\ \mathcal{T}_n\ } & B_2^n
\end{array} \tag{1.5}$$

The spaces $B_i^n$, $i = 1, 2$, are usually finite-dimensional, thus reducing the problem to
a linear algebraic setting.

From the above scheme one requires that the operators $\mathcal{T}$ and $\mathcal{T}_n$ are consistent,
i.e., $\|\mathcal{D}_2^n \mathcal{T}(f) - \mathcal{T}_n \mathcal{D}_1^n(f)\| \to 0$, as $n \to \infty$. Note that consistency implies that
diagram (1.5) commutes in the limit. If, in addition, the sequence of discretized
operators $\{(\mathcal{T}_n)^{-1}\}_{n\in\mathbb{N}}$ is uniformly bounded, then the discretization procedure is
called stable.

The next theorem gives conditions under which a discretization procedure yields 
utilizable results. 

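Theorem 1.8. Let $f_i \in B_i$ and $f_i^n \in B_i^n$, $i = 1, 2$, be such that $\mathcal{T} f_1 = f_2$ and $\mathcal{T}_n f_1^n = f_2^n$,
for all $n \in \mathbb{N}$. Suppose that $\|\mathcal{D}_2^n f_2 - f_2^n\| \to 0$ as $n \to \infty$. Furthermore, suppose
that $\|\mathcal{D}_2^n \mathcal{T}(f) - \mathcal{T}_n \mathcal{D}_1^n(f)\| \to 0$ as $n \to \infty$ and that $\{(\mathcal{T}_n)^{-1}\}_{n\in\mathbb{N}}$ is uniformly
bounded. Then

$$\|\mathcal{D}_1^n f_1 - f_1^n\| \to 0 \quad \text{as } n \to \infty.$$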


Proof. Exercise! □ 

Remark 1.9. Theorem 1.8 can also be reformulated as: consistency plus stability 
implies convergence. 
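
One possible route to Theorem 1.8 is sketched below. It is only an outline under stated assumptions: the discretization operators are written $\mathcal{D}_i^n$, and $C$ is a uniform bound for $\|(\mathcal{F}_n)^{-1}\|$ (the constant $C$ is notation introduced for this sketch only). Stability is used in the second line, and consistency together with the hypothesis on $f_2^n$ in the last line.

```latex
\begin{aligned}
\|\mathcal{D}_1^n f_1 - f_1^n\|
  &= \bigl\|(\mathcal{F}_n)^{-1}\bigl(\mathcal{F}_n \mathcal{D}_1^n f_1 - \mathcal{F}_n f_1^n\bigr)\bigr\| \\
  &\le C\,\|\mathcal{F}_n \mathcal{D}_1^n f_1 - f_2^n\| \\
  &\le C\,\bigl(\|\mathcal{F}_n \mathcal{D}_1^n f_1 - \mathcal{D}_2^n \mathcal{F} f_1\|
       + \|\mathcal{D}_2^n f_2 - f_2^n\|\bigr) \longrightarrow 0
       \quad \text{as } n \to \infty .
\end{aligned}
```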

In order to represent a continuous signal in a unique manner, the discretized
continuous transform $\mathcal{F}_n$ must yield a basis for the spaces $B_i^n$. This is exemplified by
the Fourier series and, in particular, by Theorem 1.4. Below, several other bases are
described that come from discretized operators: the translates of the sinc-function
in the sampling theorem (Section 1.3.4), the Gabor system (Section 1.4), and the
wavelet system (Section 1.5).

Time- or space-based measurements of signals yield only finitely many discrete
values although the mathematical models are usually continuous. Applying the correct
discretization procedure is therefore very important, as is the analysis of the
different types of measurement error. There are four basic types of measurement
error:

1. truncation error: arises if only a finite number of samples is taken into 
account; 

2. amplitude error: arises since in general the exact ordinate value of the signal
is not known but is contaminated by noise, or falsified due to round-off or
quantization;

3. time-/space-jitter error: arises if the sample points are not met correctly; 

4. aliasing error: arises if the signal is not exactly band-limited or the bandwidth 
is larger than assumed. 

For the precise analysis of these four types of measurement error, we refer the reader 
to [35-37] and the references given therein. 

1.2 Basic Methods of Time-Frequency Analysis: Orthonormal
Bases and Generalized Fourier Series

In this section, we introduce Fourier series using general bases. In the following 
sections, concrete Banach and Hilbert spaces are considered. For this purpose, we 
first need to consider the concept of a function basis in Banach spaces. 


1.2.1 Schauder Bases in Banach Spaces 


In the following, $X$ always denotes a separable Banach space and $H$ a separable
Hilbert space. The topological dual to $X$ is denoted by $X'$ and consists of all continuous linear
functionals $\phi : X \to \mathbb{C}$. Endowed with the operator norm

$$\|\phi\|_{op} = \sup_{0 \ne x \in X} \frac{|\phi(x)|}{\|x\|} = \sup_{\|x\| \le 1} |\phi(x)|, \quad \phi \in X',$$

$X'$ becomes a Banach space.

One of the most important concepts of a basis in analysis is that of a Schauder 
basis. 

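Definition 1.10 (Schauder 1927 [135,246]). A sequence $\{x_n\}_{n \in \mathbb{N}}$ of elements in an
infinite-dimensional Banach space $X$ is called a Schauder basis for $X$ if, for every
$x \in X$, there exists a unique sequence $\{c_n\}_{n \in \mathbb{N}}$ of scalars so that

$$\Bigl\| x - \sum_{i=1}^{n} c_i x_i \Bigr\| \to 0 \quad \text{as } n \to \infty. \tag{1.6}$$

A Schauder basis is called bounded if $0 < \inf_{n \in \mathbb{N}} \|x_n\| \le \sup_{n \in \mathbb{N}} \|x_n\| < \infty$. In case
$\{x_n\}_{n \in \mathbb{N}}$ is a Schauder basis, the linear functionals

$$f_k : X \to \mathbb{C}, \quad x = \sum_{n \in \mathbb{N}} c_n x_n \mapsto c_k, \quad k \in \mathbb{N}, \tag{1.7}$$

are called coefficient functionals.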

Example 1.11. 1. The sequence spaces $\ell^p(\mathbb{N})$, $1 \le p < \infty$, have as a Schauder basis
the canonical basis $\{e_n\}_{n \in \mathbb{N}}$, where the sequence

$$e_n = (0, \ldots, 0, 1, 0, \ldots)$$

has a “1” in the nth position.

2. Banach spaces that possess a Schauder basis are separable. Since the sequence
space $\ell^\infty(\mathbb{N})$ is not separable, it cannot have a Schauder basis.

3. Orthonormal bases $\{e_n\}_{n \in \mathbb{N}}$ in a Hilbert space $H$ are Schauder bases having the
additional property that $\langle e_n, e_m \rangle = \delta_{nm}$ and $\|e_n\| = 1$.

For Schauder bases and their coefficient functionals, the following theorem
holds.

Theorem 1.12. Let $X$ be a Banach space and $\{x_n\}_{n \in \mathbb{N}}$ a Schauder basis for $X$. Then
the following hold:

1. $\{x_n\}_{n \in \mathbb{N}}$ is complete and minimal; i.e., the Schauder basis spans the space and
the coefficients in a representation $x = \sum_{n \in \mathbb{N}} a_n x_n$ are unique for all $x \in X$ [167].

2. The coefficient functionals $\{f_n\}_{n \in \mathbb{N}}$ are continuous, hence elements of the dual
space $X'$. Moreover, one has the estimate

$$1 \le \|x_n\| \cdot \|f_n\| \le K \tag{1.8}$$

for a positive constant $K$ and all $n \in \mathbb{N}$.

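Proof. 1. The first statement follows immediately from the definition.

2. To prove the second statement [246], let $Y$ be the vector space

$$Y := \Bigl\{ \{c_n\}_{n \in \mathbb{N}} \subset \mathbb{C} : \sum_{n=1}^{\infty} c_n x_n \text{ converges in } X \Bigr\} \tag{1.9}$$

endowed with the norm

$$\|\{c_n\}_{n \in \mathbb{N}}\|_Y := \sup_{n \in \mathbb{N}} \Bigl\| \sum_{i=1}^{n} c_i x_i \Bigr\|.$$

Then $Y$ is a Banach space that is isomorphic to $X$: The mapping

$$T : Y \to X, \quad \{c_n\}_{n \in \mathbb{N}} \mapsto \sum_{n=1}^{\infty} c_n x_n$$

is linear and since $\{x_n\}_{n \in \mathbb{N}}$ is a Schauder basis for $X$, it follows from (1.9) that $T$ is
bijective. Moreover,

$$\|T\{c_n\}_{n \in \mathbb{N}}\| = \Bigl\| \sum_{n=1}^{\infty} c_n x_n \Bigr\| \le \sup_{n \in \mathbb{N}} \Bigl\| \sum_{i=1}^{n} c_i x_i \Bigr\| = \|\{c_n\}_{n \in \mathbb{N}}\|_Y,$$

implying the continuity of $T$. Thus, by the open mapping theorem, $T$ is a Banach
space isomorphism.

Let $x = \sum_{n=1}^{\infty} c_n x_n \in X$. Then we have for every $n \in \mathbb{N}$,

$$|f_n(x)| = |c_n| = \frac{\|c_n x_n\|}{\|x_n\|} \le \frac{\bigl\|\sum_{i=1}^{n} c_i x_i\bigr\| + \bigl\|\sum_{i=1}^{n-1} c_i x_i\bigr\|}{\|x_n\|} \le \frac{2 \sup_{m \in \mathbb{N}} \bigl\|\sum_{i=1}^{m} c_i x_i\bigr\|}{\|x_n\|} = \frac{2\|T^{-1}x\|_Y}{\|x_n\|} \le \frac{2\|T^{-1}\|\,\|x\|}{\|x_n\|}.$$

Hence, $\|f_n\| \le 2\|T^{-1}\| / \|x_n\|$. This implies that $f_n$ is continuous and we have proven
the right-hand side of the inequality. The left-hand side holds because of

$$1 = f_n(x_n) = |f_n(x_n)| \le \|f_n\| \cdot \|x_n\|.$$ □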

Remark 1.13. Not every bounded, complete, and minimal sequence in a Banach 
space is a Schauder basis. Below, we will encounter an example of this fact. 

In order to compare Schauder bases in a Banach space, the concept of the equivalence
of Schauder bases is needed.

Definition 1.14. Two Schauder bases $\{x_n\}_{n \in \mathbb{N}}$ and $\{y_n\}_{n \in \mathbb{N}}$ in a Banach space $X$ are
called equivalent if there exists a bounded, invertible operator $T : X \to X$ such that
$T x_n = y_n$ for all $n \in \mathbb{N}$.

Every basis in a finite-dimensional vector space can be mapped onto the 
canonical basis via an invertible operator. In infinite-dimensional vector spaces, this 
is no longer true. In such spaces, the convergence with respect to a basis depends 
in general on the order of summation. This and the question on how to find the 
coefficients with respect to a Schauder basis will be considered next. 


1.2.1.1 Biorthogonality 

For the computation of the coefficients $c_n$ in an expansion of the form $f \sim \sum_{n \in \mathbb{Z}} c_n x_n$,
certain linear functionals $y_n$ with $c_n = y_n(f)$ are used. These functionals are to satisfy
the following condition.

Definition 1.15. Two sequences $\{x_n\}_{n \in \mathbb{Z}} \subset X$ and $\{y_n\}_{n \in \mathbb{Z}} \subset X'$ are called biorthogonal
provided that

$$y_m(x_n) = \delta_{mn} \quad \text{for all } m, n \in \mathbb{Z}.$$

Using the Hahn-Banach theorems, it can be shown that in a Banach space a
sequence $\{x_n\}_{n \in \mathbb{Z}}$ has a biorthogonal sequence if it is minimal and that this biorthogonal
sequence is unique if $\{x_n\}_{n \in \mathbb{Z}}$ is complete.

The coefficient functionals satisfy additional properties. 

Theorem 1.16 ([167, 246]).

1. If $\{x_n\}_{n \in \mathbb{N}}$ is a Schauder basis in a Banach space $X$, then the associated
coefficient functionals $\{f_n\}_{n \in \mathbb{N}}$ are biorthogonal.

2. Suppose that $\{x_n\}_{n \in \mathbb{N}}$ is a Schauder basis in a reflexive Banach space $X$. Then
the coefficient functionals $\{f_n\}_{n \in \mathbb{N}}$ form a Schauder basis of the dual space $X'$.

Proof. Exercise! □ 

The connection between Schauder bases and their coefficient functionals gives 
rise to the next definition. 

Definition 1.17. Let $X$ be a reflexive Banach space and $\{x_n\}_{n \in \mathbb{N}} \subset X$ a Schauder
basis of $X$. The Schauder basis of $X'$ consisting of the coefficient functionals
$\{f_n\}_{n \in \mathbb{N}}$ is called the dual Schauder basis to $\{x_n\}_{n \in \mathbb{N}} \subset X$.

Example 1.18. Complete minimal sequences are not necessarily Schauder bases.

To verify this statement, consider a Hilbert space $H$ and an orthonormal
basis $\{e_n\}_{n \in \mathbb{N}}$. This orthonormal sequence is minimal and complete. The family
$\{x_n\}_{n \in \mathbb{N}} \subset H$ given by

$$x_n = \sum_{k=1}^{n} \frac{1}{k}\, e_k, \quad n \in \mathbb{N},$$

is also bounded, complete, and minimal, but not a Schauder basis. For otherwise,
the dual sequence $\{y_n\}_{n \in \mathbb{N}}$ with

$$y_n = n e_n - (n+1) e_{n+1}, \quad n \in \mathbb{N},$$

would also be a Schauder basis, hence complete. This, however, is not possible since
$f := \sum_{n \in \mathbb{N}} \frac{1}{n}\, e_n$ is orthogonal to every $y_n$ and therefore does not lie in the closed span of $\{y_n\}_{n \in \mathbb{N}}$.
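
The two algebraic facts behind Example 1.18 can be checked in finitely many coordinates. The sketch below is an illustration only: the helper names and the truncation dimension `DIM` are assumptions, and vectors are represented by their first `DIM` coordinates with respect to an orthonormal basis. It verifies that $\langle x_m, y_n \rangle = \delta_{mn}$ and that $f = \sum_n e_n / n$ is orthogonal to every $y_n$.

```python
def x_vec(n, dim):
    """Coordinates of x_n = sum_{k=1}^n e_k / k in the first `dim` basis vectors."""
    return [1.0 / k if k <= n else 0.0 for k in range(1, dim + 1)]


def y_vec(n, dim):
    """Coordinates of y_n = n e_n - (n + 1) e_{n+1}, truncated to `dim` entries."""
    v = [0.0] * dim
    v[n - 1] = float(n)
    if n < dim:
        v[n] = -float(n + 1)
    return v


def dot(u, v):
    """Real inner product of two coordinate vectors."""
    return sum(a * b for a, b in zip(u, v))


DIM = 12   # truncation dimension (an assumption of this sketch)
N = 10     # indices 1..N are checked; N < DIM, so no y_n is cut off

# Gram matrix <x_m, y_n>; biorthogonality means it is the identity.
gram = [[dot(x_vec(m, DIM), y_vec(n, DIM)) for n in range(1, N + 1)]
        for m in range(1, N + 1)]

# f = sum_n e_n / n pairs to zero with every y_n.
f = [1.0 / k for k in range(1, DIM + 1)]
residuals = [dot(f, y_vec(n, DIM)) for n in range(1, N + 1)]
```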


1.2.1.2 Unconditional Convergence 

In our consideration of Schauder bases, we have so far assumed conditional convergence:
The Schauder basis consists of an indexed sequence and the convergence of
a series with respect to this basis depends on the order of the basis elements. In the
case of unconditional bases, the order of the basis elements is irrelevant.

Definition 1.19. A series $\sum_{n \in \mathbb{N}} a_n$ in a Banach space $X$ is called unconditionally
convergent if, for every permutation $\sigma : \mathbb{N} \to \mathbb{N}$, the series $\sum_{n \in \mathbb{N}} a_{\sigma(n)}$ converges to the
same element in $X$.

A Schauder basis $\{x_n\}_{n \in \mathbb{N}}$ for $X$ is called an unconditional basis for $X$ if every
convergent series of the form $\sum_{n \in \mathbb{N}} c_n x_n$ converges unconditionally.

Remark 1.20. In signal analysis it is often customary to sum up the $N$ largest manipulated
coefficients and to consider the limit $N \to \infty$ to obtain an approximation of
the signal. If conditional bases are employed, then it is no longer guaranteed that the
result of the summation, i.e., the synthesized signal, is interpretable. The associated
series is summed up using an unpredictable order and may not converge.

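Example 1.21. Recall that $c_0(\mathbb{N}) := \{x \in \mathbb{R}^{\mathbb{N}} \mid \lim_{n \to \infty} x_n = 0\}$. The sequence
$\{d_n\}_{n \in \mathbb{N}}$, where

$$d_1 = (1, 0, 0, \ldots), \quad d_2 = (1, 1, 0, \ldots), \quad d_3 = (1, 1, 1, 0, \ldots), \quad \ldots,$$

is a conditional basis for the sequence space $c_0(\mathbb{N})$.

For suppose that $\{a_n\}_{n \in \mathbb{N}}$ is a sequence in $\mathbb{R}$ so that $\sum_{n \in \mathbb{N}} a_n$ converges, but
does not converge absolutely. Then the series $\sum_{n \in \mathbb{N}} a_n d_n$ is not unconditionally
convergent. (Show this!)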
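
How expansions in the basis $\{d_n\}$ of Example 1.21 behave can be made concrete for finitely supported sequences. The following sketch is illustrative only; the function names and the test vector are assumptions. Since the $k$th coordinate of $\sum_n a_n d_n$ equals $\sum_{n \ge k} a_n$, the coefficients of $x$ are $a_n = x_n - x_{n+1}$, and the sup-norm error of the $N$th partial sum is governed by the entries of $x$ beyond index $N$.

```python
def summing_basis_coefficients(x):
    """Coefficients a_n with x = sum_n a_n d_n for a finitely supported x.

    d_n = (1, ..., 1, 0, ...) has n leading ones, so the k-th coordinate of
    sum_n a_n d_n is sum_{n >= k} a_n; hence a_n = x_n - x_{n+1}.
    """
    padded = list(x) + [0.0]
    return [padded[n] - padded[n + 1] for n in range(len(x))]


def partial_sum(a, N, length):
    """First `length` coordinates of S_N = sum_{n=1}^N a_n d_n."""
    s = [0.0] * length
    for n in range(1, N + 1):              # d_n adds a_n to coordinates 1..n
        for k in range(min(n, length)):
            s[k] += a[n - 1]
    return s


def sup_error(x, N):
    """Sup-norm distance between x and its N-th partial sum."""
    a = summing_basis_coefficients(x)
    s = partial_sum(a, N, len(x))
    return max(abs(xi - si) for xi, si in zip(x, s))


# Hypothetical test vector: x_k = 1/k for k = 1..50, zero afterwards.
x = [1.0 / k for k in range(1, 51)]
errors = [sup_error(x, N) for N in (5, 10, 25, 50)]
```

For this test vector the error after $N < 50$ terms is $x_{N+1} = 1/(N+1)$, and it vanishes up to rounding once all 50 coefficients are used.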

In Hilbert spaces $H$, there are generalizations of unconditional bases in several
aspects. The most important ones are the notions of a Riesz basis and of a frame.
A frame $\{x_n\}_{n \in \mathbb{N}}$ in $H$ is a redundant family of functions that spans $H$ and fulfills
the following stability condition: There exist constants $A, B > 0$ such that

$$A\|x\|^2 \le \sum_{n \in \mathbb{N}} |\langle x, x_n \rangle|^2 \le B\|x\|^2, \quad \forall x \in H.$$

It can be shown that frame representations of functions are unconditionally convergent.
A Riesz basis is a frame that is also a Schauder basis, i.e., a minimal frame, and
allows for unconditionally convergent series representations. Moreover, it is equivalent
to an orthonormal basis. A detailed description of frames and Riesz bases
is given in Ole Christensen’s Chapter 2. Absolutely and therefore unconditionally
convergent Fourier series are explored in Karlheinz Gröchenig’s Chapter 5.


1.2.2 Generalized Fourier Series 

The most important property of orthonormal bases as compared to other bases is the
simplicity of the terms in a basis representation. If $\{e_n\}_{n \in \mathbb{N}}$ is an orthonormal basis
in a Hilbert space $H$, then every $f \in H$ has a Fourier series representation of the
form

$$f = \sum_{n=1}^{\infty} \langle f, e_n \rangle\, e_n,$$

and this series converges in the induced norm on $H$. The inner product $\langle f, e_n \rangle$ is
called the nth Fourier coefficient of $f$. The theorem of Pythagoras implies Parseval’s
equality:

$$\|f\|^2 = \sum_{n=1}^{\infty} |\langle f, e_n \rangle|^2. \tag{1.10}$$

In particular, Parseval’s equality means that the linear mapping

$$S : H \to \ell^2(\mathbb{N}), \quad f \mapsto \{\langle f, e_n \rangle\}_{n \in \mathbb{N}}$$

is an isometric Hilbert space isomorphism. Therefore, $S$ also preserves scalar products
and the weak Parseval equality holds:

$$\langle f, g \rangle = \sum_{n=1}^{\infty} \langle f, e_n \rangle\, \overline{\langle g, e_n \rangle}, \quad \forall f, g \in H.$$

Example 1.22. In $\ell^2(\mathbb{N})$, the canonical basis $\{e_n\}_{n \in \mathbb{N}}$ in Example 1.11, item 1, is an
orthonormal basis.

Theorem 1.23. For every finite orthonormal system $\{e_n\}_{n=1}^{N}$, we have that

$$\sum_{n=1}^{N} |\langle f, e_n \rangle|^2 \le \|f\|^2.$$

This result has the immediate corollary

Lemma 1.24 (Lemma of Riemann-Lebesgue in Hilbert spaces).

$$\lim_{n \to \infty} \langle f, e_n \rangle = 0, \quad \forall f \in H.$$
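
Parseval's equality, its weak form, and the inequality of Theorem 1.23 can be observed in the finite-dimensional Hilbert space $\mathbb{C}^n$. The sketch below is an illustration under assumptions not made in the text: the orthonormal basis is the normalized discrete Fourier basis, and the test vectors `f` and `g` are arbitrary choices.

```python
import cmath


def inner(u, v):
    """Inner product <u, v> = sum_k u_k * conj(v_k) on C^n."""
    return sum(a * b.conjugate() for a, b in zip(u, v))


def dft_basis(n):
    """Orthonormal basis e_m(k) = exp(2*pi*i*m*k/n) / sqrt(n) of C^n."""
    return [[cmath.exp(2j * cmath.pi * m * k / n) / n ** 0.5 for k in range(n)]
            for m in range(n)]


n = 8
basis = dft_basis(n)
f = [complex(k, (-1) ** k) for k in range(n)]     # arbitrary test vector
g = [complex(1.0, k * k) for k in range(n)]       # second test vector

coeff_f = [inner(f, e) for e in basis]            # Fourier coefficients <f, e_m>
coeff_g = [inner(g, e) for e in basis]

norm_sq = inner(f, f).real
parseval_sum = sum(abs(c) ** 2 for c in coeff_f)      # full system: equality
bessel_sum = sum(abs(c) ** 2 for c in coeff_f[:3])    # subsystem: inequality
weak_parseval = sum(a * b.conjugate() for a, b in zip(coeff_f, coeff_g))
```

Dropping basis vectors, as in `bessel_sum`, can only lose energy, which is the content of the inequality for finite orthonormal systems.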


1.3 The Fourier Integral Transform 

In this section, we introduce the Fourier integral transform on $L^1(\mathbb{R})$ and discuss
some of its properties. In addition, we define the convolution between two integrable
functions and the concept of a summation kernel. Extending the Fourier transform
to a Hilbert space setting gives rise to the Plancherel transform, which is presented
next. The theorem of Paley-Wiener and the Poisson summation formula conclude
this section.

1.3.1 Definition and Properties

We consider time-continuous, Lebesgue-integrable functions $f \in L^1(\mathbb{R})$.

Definition 1.25. Assume that $f \in L^1(\mathbb{R})$. The Fourier transform $\mathcal{F}(f)$ of $f$ is
defined by

$$\mathcal{F}(f)(\omega) := \int_{\mathbb{R}} f(t)\, e^{-i\omega t}\, dt$$

for all $\omega \in \mathbb{R}$.


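Theorem 1.26. Assume that $f, g \in L^1(\mathbb{R})$, $\omega \in \mathbb{R}$, $\lambda \in \mathbb{C}$.

1. $\mathcal{F}$ is linear: $\mathcal{F}(\lambda f + g)(\omega) = \lambda \mathcal{F}(f)(\omega) + \mathcal{F}(g)(\omega)$.

2. Let $\bar{f}(t) := \overline{f(t)}$. Then $\mathcal{F}(\bar{f})(\omega) = \overline{\mathcal{F}(f)(-\omega)}$.

3. Let $L_s f(t) := f(t - s)$, for $s \in \mathbb{R}$. Then $\mathcal{F}(L_s f)(\omega) = e^{-is\omega} \mathcal{F}(f)(\omega)$.

4. $|\mathcal{F}(f)(\omega)| \le \|f\|_1$.

5. Let $f_\lambda(t) := |\lambda| f(\lambda t)$, for $\lambda \in \mathbb{R} \setminus \{0\}$. Then $\mathcal{F}(f_\lambda)(\omega) = \mathcal{F}(f)(\omega/\lambda)$.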

Proof. Exercise! □ 
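
The translation and dilation rules of Theorem 1.26 can be checked numerically. The following is a rough sketch, not a proof: it assumes the convention $\mathcal{F}(f)(\omega) = \int f(t) e^{-i\omega t}\, dt$, approximates the integral by a midpoint rule on a truncated interval, and uses a Gaussian test function; the helper name `fourier` and all parameter values are illustrative choices.

```python
import cmath
import math


def fourier(f, omega, a=-12.0, b=12.0, steps=24000):
    """Midpoint-rule approximation of F(f)(omega) = int f(t) exp(-i omega t) dt."""
    h = (b - a) / steps
    total = 0j
    for j in range(steps):
        t = a + (j + 0.5) * h
        total += f(t) * cmath.exp(-1j * omega * t)
    return total * h


def gauss(t):
    """Test function exp(-t^2); its transform is sqrt(pi) * exp(-omega^2 / 4)."""
    return math.exp(-t * t)


s, lam, omega = 1.5, 2.0, 0.7

base = fourier(gauss, omega)
shifted = fourier(lambda t: gauss(t - s), omega)          # F(L_s f)(omega)
dilated = fourier(lambda t: lam * gauss(lam * t), omega)  # F(f_lam)(omega), lam > 0
stretched = fourier(gauss, omega / lam)                   # F(f)(omega / lam)
l1_norm = fourier(lambda t: abs(gauss(t)), 0.0).real      # ||f||_1
```

Because the Gaussian and all its derivatives are negligible at the ends of the interval, the midpoint rule is far more accurate here than its generic second-order error bound suggests.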

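Remark 1.27. In higher dimensions and $f \in L^1(\mathbb{R}^n)$, $n > 1$, one defines for $\omega \in \mathbb{R}^n$
the Fourier transform of $f$ by

$$\mathcal{F}(f)(\omega) := \int_{\mathbb{R}^n} f(t)\, e^{-i\langle \omega, t \rangle}\, dt.$$

Theorem 1.28. Suppose that $f \in L^1(\mathbb{R})$. Then the function $\mathcal{F}(f) : \mathbb{R} \to \mathbb{C}$ is
bounded and uniformly continuous.

Proof. Exercise! □

Remark 1.29. Note that in case of the Fourier transform in $L^1(\mathbb{T})$ we dealt with
a discrete-frequency spectrum, whereas here in $L^1(\mathbb{R})$ we consider a continuous-frequency
spectrum $\omega \in \mathbb{R}$.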

Completely analogous to the case $L^1(\mathbb{T})$, the Banach space $L^1(\mathbb{R})$ can be
endowed with a convolution structure and thus becomes a Banach algebra.

Theorem 1.30. Suppose that $f, g \in L^1(\mathbb{R})$. For almost all $t \in \mathbb{R}$, the mapping
$s \mapsto f(t - s) \cdot g(s)$ is absolutely integrable. Let

$$h(t) := \int_{\mathbb{R}} f(t - s)\, g(s)\, ds.$$

Then $h \in L^1(\mathbb{R})$ and

$$\|h\|_1 \le \|f\|_1 \|g\|_1 \quad \text{and} \quad \mathcal{F}(h)(\omega) = \mathcal{F}(f)(\omega) \cdot \mathcal{F}(g)(\omega), \quad \forall \omega \in \mathbb{R}.$$

Proof. Analogous to Theorem 1.5. Exercise! □

Definition 1.31. Assume that $f, g \in L^1(\mathbb{R})$. The function $h$ in Theorem 1.30 is
denoted by $f * g$ and called the convolution of $f$ and $g$.

Remark 1.32. The Banach space $L^1(\mathbb{T})$ together with the operation of convolution
becomes a commutative Banach algebra without unity. The Fourier transform is a
Banach algebra homomorphism

$$\mathcal{F} : L^1(\mathbb{T}) \to \ell^\infty(\mathbb{Z}),$$

with respect to the Banach algebra $\ell^\infty(\mathbb{Z})$ with elementwise multiplication.

Remark 1.33. The Fourier transform is an algebra homomorphism into $C_{ub}(\mathbb{R})$, i.e.,
the Banach algebra of all uniformly continuous (u) and bounded (b) functions on $\mathbb{R}$
with pointwise multiplication.

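Theorem 1.34. Let $f, k \in L^1(\mathbb{R})$ and suppose that

$$k(t) = \int_{\mathbb{R}} K(\omega)\, e^{i\omega t}\, d\omega$$

for some $K \in L^1(\mathbb{R})$. Then

$$k * f(t) = \int_{\mathbb{R}} K(\omega)\, \mathcal{F}(f)(\omega)\, e^{i\omega t}\, d\omega.$$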


Proof. Exercise! □ 

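Theorem 1.35. Let $f \in L^1(\mathbb{R})$.

1. Let

$$F(t) := \int_{-\infty}^{t} f(s)\, ds \quad \text{for } t \in \mathbb{R}.$$

If $F \in L^1(\mathbb{R})$, then

$$\mathcal{F}(F)(\omega) = \frac{1}{i\omega}\, \mathcal{F}(f)(\omega)$$

for all $\omega \in \mathbb{R} \setminus \{0\}$.

2. Suppose that $f$ is a differentiable function on $\mathbb{R}$ whose derivative $f' \in L^1(\mathbb{R})$.
Then

$$\mathcal{F}(f)(\omega) = \frac{1}{i\omega}\, \mathcal{F}(f')(\omega)$$

for all $\omega \in \mathbb{R} \setminus \{0\}$.

Proof. It suffices to show 1. We know that $F'(t) = f(t)$ holds for almost all $t \in \mathbb{R}$.
Integration by parts and the dominated convergence theorem yield

$$\mathcal{F}(F)(\omega) = \lim_{A \to \infty} \Bigl[ F(t)\, \frac{e^{-i\omega t}}{-i\omega} \Bigr]_{t=-A}^{A} + \int_{\mathbb{R}} f(t)\, \frac{e^{-i\omega t}}{i\omega}\, dt.$$

Obviously, $\lim_{A \to -\infty} F(A) = 0$. Since $f$ is integrable, the limit $\lim_{A \to \infty} F(A)$ exists
and is finite, for

$$\lim_{A \to \infty} F(A) = \int_{\mathbb{R}} f(t)\, dt$$

exists.

Assume that $\lim_{A \to \infty} F(A) = a \ne 0$. Then there exists $A_0 > 0$ so that $|F(A)| >
|a|/2 > 0$ for all $A > A_0$. This, however, is a contradiction to $F \in L^1(\mathbb{R})$. □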

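Theorem 1.36. Suppose that $f \in L^1(\mathbb{R})$. Set $g(t) := t f(t)$, and assume that
$g \in L^1(\mathbb{R})$. Then $\mathcal{F}(f)$ is differentiable and, for all $\omega \in \mathbb{R}$, we have

$$(\mathcal{F}(f))'(\omega) = \mathcal{F}(-ig)(\omega).$$

Proof. Consider

$$\frac{\mathcal{F}(f)(\omega + h) - \mathcal{F}(f)(\omega)}{h} = \int_{\mathbb{R}} f(t)\, e^{-i\omega t}\, \frac{e^{-iht} - 1}{h}\, dt.$$

Now,

$$\Bigl| \frac{e^{-iht} - 1}{h} \Bigr| \le |t|$$

and

$$\frac{e^{-iht} - 1}{h} \to -it \quad \text{for } h \to 0.$$

Since $g(t) = t f(t) \in L^1(\mathbb{R})$ is absolutely integrable, the dominated convergence
theorem yields

$$(\mathcal{F}(f))'(\omega) = -i \int_{\mathbb{R}} f(t)\, e^{-i\omega t}\, t\, dt = -i\, \mathcal{F}(g)(\omega).$$ □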

Induction implies a corollary to the above theorem. 

Corollary 1.37. Suppose $f \in L^1(\mathbb{R})$ is such that $t \mapsto t^n f(t) =: g(t)$ is absolutely
integrable for an $n \in \mathbb{N}$. Then $\mathcal{F}(f)$ is n-times differentiable and

$$(\mathcal{F}(f))^{(n)}(\omega) = (-i)^n\, \mathcal{F}(g)(\omega), \quad \text{for all } \omega \in \mathbb{R}.$$

In particular, for $\omega = 0$, we obtain

$$(\mathcal{F}(f))^{(n)}(0) = (-i)^n m_n,$$

where

$$m_n := \int_{\mathbb{R}} t^n f(t)\, dt$$

is the nth moment of $f$.

Corollary 1.38. Suppose that $f : \mathbb{R} \to \mathbb{C}$ has compact support and is twice continuously
differentiable. Then $\mathcal{F}(f) \in L^1(\mathbb{R})$.

Proof. Since $f$ has compact support and is an element of $C^2$, the derivatives $f'$ and
$f''$ are also compactly supported. Theorem 1.35 implies that

$$|\mathcal{F}(f)(\omega)| = \frac{|\mathcal{F}(f'')(\omega)|}{\omega^2} \le \frac{\|f''\|_1}{\omega^2}, \quad \omega \ne 0.$$

Divide $\mathbb{R}$ into the sets $[-1, 1]$ and $\mathbb{R} \setminus [-1, 1]$. On the compact set $[-1, 1]$ the continuous
function $\mathcal{F}(f)$ is bounded and thus integrable. On $\mathbb{R} \setminus [-1, 1]$
the function $\mathcal{F}(f)(\omega)$ is also integrable since it decays like $1/\omega^2$. Hence,
$\mathcal{F}(f) \in L^1(\mathbb{R})$. □

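Our next goal is to find approximations for the missing unity in the Banach
algebra $(L^1(\mathbb{R}), *)$. For this purpose, we need a definition.

Definition 1.39. A summation kernel on $\mathbb{R}$ is a family $\{k_\lambda\}_{\lambda \in (0,\infty)}$ of continuous
functions with the following properties.

(S1) $\int_{\mathbb{R}} k_\lambda(t)\, dt = 1$, for all $\lambda \in (0, \infty)$.

(S2) $\int_{\mathbb{R}} |k_\lambda(t)|\, dt \le M$, for all $\lambda \in (0, \infty)$ and a constant $M > 0$.

(S3) For all $\delta > 0$, we have that

$$\lim_{\lambda \to \infty} \int_{|t| > \delta} |k_\lambda(t)|\, dt = 0.$$

Example 1.40. 1. Summation kernels can be constructed via the following procedure.
Choose an $f \in L^1(\mathbb{R}) \cap C(\mathbb{R})$ with $\int_{\mathbb{R}} f(t)\, dt = 1$ and set $k_\lambda(t) := \lambda f(\lambda t)$.
Then

• $\int_{\mathbb{R}} k_\lambda(t)\, dt = \int_{\mathbb{R}} \lambda f(\lambda t)\, dt = \int_{\mathbb{R}} f(s)\, ds = 1$. Hence, (S1) is satisfied.

• $\int_{\mathbb{R}} |k_\lambda(t)|\, dt = \int_{\mathbb{R}} |\lambda f(\lambda t)|\, dt = \int_{\mathbb{R}} |f(s)|\, ds = \|f\|_1$. Thus, (S2) holds.

• For $\delta > 0$, we have

$$\int_{|t| > \delta} |k_\lambda(t)|\, dt = \int_{|t| > \delta} |\lambda f(\lambda t)|\, dt = \int_{|s| > \delta\lambda} |f(s)|\, ds \to 0, \quad \text{as } \lambda \to \infty.$$

Therefore, (S3) is also valid.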
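
2. The Fejér kernel:

$$F(t) := \frac{1}{2\pi} \Bigl( \frac{\sin\frac{t}{2}}{\frac{t}{2}} \Bigr)^2, \quad F_\lambda(t) := \lambda F(\lambda t), \quad \lambda \in (0, \infty).$$

$(F_\lambda)_{\lambda \in (0,\infty)}$ is called the Fejér kernel on $\mathbb{R}$ and it is a summation kernel. To verify
this last claim, it suffices to show that $\int_{\mathbb{R}} F(t)\, dt = 1$.

To this end, we use the Fejér kernel

$$\frac{1}{n+1} \Bigl( \frac{\sin\frac{n+1}{2}t}{\sin\frac{t}{2}} \Bigr)^2 = \sum_{k=-n}^{n} \Bigl( 1 - \frac{|k|}{n+1} \Bigr) e^{ikt}$$

defined on $\mathbb{T}$. It is easy to verify that

$$\frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{1}{n+1} \Bigl( \frac{\sin\frac{n+1}{2}t}{\sin\frac{t}{2}} \Bigr)^2 dt = 1$$

and

$$\lim_{n \to \infty} \frac{1}{2\pi} \int_{\delta}^{2\pi - \delta} \frac{1}{n+1} \Bigl( \frac{\sin\frac{n+1}{2}t}{\sin\frac{t}{2}} \Bigr)^2 dt = 0.$$

Thus,

$$\lim_{n \to \infty} \frac{1}{2\pi} \int_{-\delta}^{\delta} \frac{1}{n+1} \Bigl( \frac{\sin\frac{n+1}{2}t}{\sin\frac{t}{2}} \Bigr)^2 dt = 1. \tag{1.11}$$

For every $0 < \varepsilon < 1$ there exists a $\delta > 0$, so that for all $|t| < \delta$, the following inequalities
hold:

$$\Bigl(\sin\frac{t}{2}\Bigr)^2 \le \Bigl(\frac{t}{2}\Bigr)^2 \le (1 + \varepsilon) \Bigl(\sin\frac{t}{2}\Bigr)^2.$$

For such a $\delta > 0$, we thus have that

$$\frac{1}{1+\varepsilon}\, \frac{1}{2\pi} \int_{-\delta}^{\delta} \frac{1}{n+1} \Bigl( \frac{\sin\frac{n+1}{2}t}{\sin\frac{t}{2}} \Bigr)^2 dt \le \frac{1}{2\pi} \int_{-\delta}^{\delta} \frac{1}{n+1} \Bigl( \frac{\sin\frac{n+1}{2}t}{\frac{t}{2}} \Bigr)^2 dt \le \frac{1}{2\pi} \int_{-\delta}^{\delta} \frac{1}{n+1} \Bigl( \frac{\sin\frac{n+1}{2}t}{\sin\frac{t}{2}} \Bigr)^2 dt. \tag{1.12}$$

Setting $\lambda := n + 1$ yields

$$\int_{\mathbb{R}} F(t)\, dt = \int_{\mathbb{R}} F_{n+1}(t)\, dt = \int_{-\delta}^{\delta} F_{n+1}(t)\, dt + \int_{\mathbb{R} \setminus [-\delta, \delta]} F_{n+1}(t)\, dt. \tag{1.13}$$

The second integral in (1.13) vanishes:

$$\int_{\mathbb{R} \setminus [-\delta, \delta]} F_{n+1}(t)\, dt = \frac{1}{2\pi} \int_{\mathbb{R} \setminus [-\delta, \delta]} \frac{1}{n+1} \Bigl( \frac{\sin\frac{n+1}{2}t}{\frac{t}{2}} \Bigr)^2 dt \to 0 \quad \text{for } n \to \infty.$$

Hence,

$$\int_{\mathbb{R}} F(t)\, dt = \lim_{n \to \infty} \int_{-\delta}^{\delta} F_{n+1}(t)\, dt = \lim_{n \to \infty} \frac{1}{2\pi} \int_{-\delta}^{\delta} \frac{1}{n+1} \Bigl( \frac{\sin\frac{n+1}{2}t}{\frac{t}{2}} \Bigr)^2 dt.$$

Equations (1.11) and (1.12) now imply that

$$\frac{1}{1+\varepsilon} \le \lim_{n \to \infty} \frac{1}{2\pi} \int_{-\delta}^{\delta} \frac{1}{n+1} \Bigl( \frac{\sin\frac{n+1}{2}t}{\frac{t}{2}} \Bigr)^2 dt \le 1.$$

Since $0 < \varepsilon < 1$ was arbitrary, the claim that $\int_{\mathbb{R}} F(t)\, dt = 1$ follows.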
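
The normalization and the concentration property (S3) of the Fejér kernel can also be observed numerically. The sketch below is illustrative only and rests on assumptions: it takes $F(t) = \frac{1}{2\pi}\bigl(\sin(t/2)/(t/2)\bigr)^2$, uses a midpoint rule, and truncates the real line at $|t| = 2000$, which discards a tail of size roughly $2/(2000\pi) \approx 3 \cdot 10^{-4}$ because $F$ decays only like $1/t^2$. The helper names are hypothetical.

```python
import math


def fejer(t):
    """Fejer kernel F(t) = (1 / (2 pi)) * (sin(t/2) / (t/2))**2 on the real line."""
    if t == 0.0:
        return 1.0 / (2.0 * math.pi)
    u = 0.5 * t
    return (math.sin(u) / u) ** 2 / (2.0 * math.pi)


def integrate(f, a, b, steps):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (j + 0.5) * h) for j in range(steps))


def mass_outside(lam, delta, cutoff=2000.0, steps=100000):
    """Approximate integral of F_lam(t) = lam * F(lam * t) over delta < |t| < cutoff / lam.

    By symmetry and the substitution s = lam * t this equals
    2 * int_{lam * delta}^{cutoff} F(s) ds.
    """
    return 2.0 * integrate(fejer, lam * delta, cutoff, steps)


total = integrate(fejer, -2000.0, 2000.0, 200000)          # close to 1
tails = [mass_outside(lam, 0.5) for lam in (1.0, 10.0, 100.0)]
```

The entries of `tails` shrink as the scale grows, which is property (S3) for the fixed choice $\delta = 0.5$.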

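Definition 1.41. A family $(k_\lambda)_{\lambda \in (0,\infty)}$ is called an approximate identity for $L^1(\mathbb{R})$ if

$$\lim_{\lambda \to \infty} \|f - k_\lambda * f\|_1 = 0, \quad \forall f \in L^1(\mathbb{R}).$$

It can be shown that every summation kernel on $\mathbb{R}$ is an approximate identity
for $L^1(\mathbb{R})$.

Theorem 1.42. Assume that $\{k_\lambda\}_{\lambda \in (0,\infty)}$ is a summation kernel. Then

$$\lim_{\lambda \to \infty} \|f - k_\lambda * f\|_1 = 0, \quad \forall f \in L^1(\mathbb{R}).$$

Proof. See, e.g., [151, 162]. □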

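The special case $k_\lambda := F_\lambda$ yields a corollary.

Corollary 1.43. Let $f \in L^1(\mathbb{R})$. For a given $\lambda > 0$, set

$$\sigma_\lambda(f)(t) := F_\lambda * f(t) = \frac{1}{2\pi} \int_{-\lambda}^{\lambda} \Bigl( 1 - \frac{|\omega|}{\lambda} \Bigr) \mathcal{F}(f)(\omega)\, e^{i\omega t}\, d\omega.$$

Then

$$\lim_{\lambda \to \infty} \|f - F_\lambda * f\|_1 = 0.$$

Proof. Only the validity of the second equality needs to be shown. For this purpose,
define

$$\Delta(t) := \begin{cases} \frac{1}{2\pi}(1 - |t|), & \text{for } |t| \le 1, \\ 0, & \text{else.} \end{cases}$$

Then, for $\omega \ne 0$, one has that

$$\int_0^1 (1 - t)\, e^{-i\omega t}\, dt = \frac{1}{i\omega} - \frac{1}{\omega^2}\bigl(e^{-i\omega} - 1\bigr)$$

and

$$\int_{-1}^0 (1 + t)\, e^{-i\omega t}\, dt = -\frac{1}{i\omega} - \frac{1}{\omega^2}\bigl(e^{i\omega} - 1\bigr).$$

Hence, for all $\omega \ne 0$,

$$\mathcal{F}(\Delta)(\omega) = \frac{1}{2\pi}\, \frac{2 - 2\cos\omega}{\omega^2} = \frac{1}{2\pi} \Bigl( \frac{\sin\frac{\omega}{2}}{\frac{\omega}{2}} \Bigr)^2.$$

In addition, $\mathcal{F}(\Delta)(0) = 1/2\pi = F(0)$. Therefore, $F = \mathcal{F}(\Delta)$ and

$$F_\lambda(\omega) = \lambda F(\lambda\omega) = \lambda\, \mathcal{F}(\Delta)(\lambda\omega) = \mathcal{F}(\Delta_\lambda)(\omega),$$

where

$$\Delta_\lambda(t) := \Delta(t/\lambda) = \begin{cases} \frac{1}{2\pi}\bigl(1 - \frac{|t|}{\lambda}\bigr), & \text{for } |t| \le \lambda, \\ 0, & \text{else.} \end{cases}$$

Thus,

$$F_\lambda(t) = \frac{1}{2\pi} \int_{-\lambda}^{\lambda} \Bigl( 1 - \frac{|\omega|}{\lambda} \Bigr) e^{-i\omega t}\, d\omega = \frac{1}{2\pi} \int_{-\lambda}^{\lambda} \Bigl( 1 - \frac{|\omega|}{\lambda} \Bigr) e^{i\omega t}\, d\omega,$$

since $\Delta_\lambda$ is an even function.
Theorem 1.34 implies

$$F_\lambda * f(t) = \frac{1}{2\pi} \int_{-\lambda}^{\lambda} \Bigl( 1 - \frac{|\omega|}{\lambda} \Bigr) \mathcal{F}(f)(\omega)\, e^{i\omega t}\, d\omega.$$

Now, $F_\lambda$ is a summation kernel and according to Theorem 1.42 an approximate
identity for $L^1(\mathbb{R})$, which implies the statement. □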

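Corollary 1.44 (Uniqueness).

Let $f \in L^1(\mathbb{R})$ satisfy $\mathcal{F}(f)(\omega) = 0$ for all $\omega \in \mathbb{R}$. Then $f = 0$ almost everywhere.

Corollary 1.45 (Inversion formula).

Let $f \in L^1(\mathbb{R})$ satisfy $\mathcal{F}(f) \in L^1(\mathbb{R})$. Then

$$f(t) = \frac{1}{2\pi} \int_{\mathbb{R}} \mathcal{F}(f)(\omega)\, e^{i\omega t}\, d\omega, \quad \text{for almost all } t \in \mathbb{R},$$

with equality at the points of continuity of $f$.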
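
Proof. For fixed $t \in \mathbb{R}$, the function

$$\mathbb{R} \to \mathbb{C} : \quad \omega \mapsto \chi_{[-\lambda,\lambda]}(\omega) \Bigl( 1 - \frac{|\omega|}{\lambda} \Bigr) \mathcal{F}(f)(\omega)\, e^{i\omega t}$$

has as its limit the function

$$\omega \mapsto \mathcal{F}(f)(\omega)\, e^{i\omega t},$$

as $\lambda \to \infty$. The modulus of this function can be bounded above by $|\mathcal{F}(f)|$. As
by assumption $\mathcal{F}(f) \in L^1(\mathbb{R})$, the dominated convergence theorem in the $L^1$-norm
implies

$$\lim_{\lambda \to \infty} F_\lambda * f(t) = \lim_{\lambda \to \infty} \frac{1}{2\pi} \int_{-\lambda}^{\lambda} \Bigl( 1 - \frac{|\omega|}{\lambda} \Bigr) \mathcal{F}(f)(\omega)\, e^{i\omega t}\, d\omega = \frac{1}{2\pi} \int_{\mathbb{R}} \mathcal{F}(f)(\omega)\, e^{i\omega t}\, d\omega.$$

Note that if a sequence $g_n \to g$ in $L^1(\mathbb{R})$, i.e., $\|g_n - g\|_1 \to 0$ as $n \to \infty$, then there
exists a subsequence $g_{n_k}$ such that

$$\lim_{k \to \infty} g_{n_k}(t) = g(t) \quad \text{a.e.}$$

Applying this to the setting at hand, we deduce the existence of a subsequence
$\{F_{\lambda_k}\}_{k \in \mathbb{N}}$ with the property that $F_{\lambda_k} * f \to f$ almost everywhere as $k \to \infty$.

The statement regarding the points of continuity of $f$ follows from Theorem 1.51
below. □

Fig. 1.2: (a) The Fejér kernel $F(t)$ and (b) its Fourier integral transform $\Delta(\omega)$ in $L^1(\mathbb{R})$.
(c) For scales $\lambda \cdot F(\lambda t)$ with increasing $\lambda = 1, 2, 3$, the Fejér kernel becomes narrower and higher
and (d) the Fourier integral transform wider.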

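Definition 1.46. For a $g \in L^1(\mathbb{R})$, we denote the inverse Fourier transform of $g$ by

$$\mathcal{F}^{-1}(g)(t) := \frac{1}{2\pi} \int_{\mathbb{R}} g(\omega)\, e^{i\omega t}\, d\omega.$$

Remark 1.47. For functions $f \in L^1(\mathbb{R})$ such that $\mathcal{F}(f) \in L^1(\mathbb{R})$, one has that

$$\mathcal{F}^{-1}(\mathcal{F}(f)) = f \quad \text{a.e.}$$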

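Example 1.48 (The Fejér kernel revisited). We already know that $\mathcal{F}(\Delta) = F$. Since
$\Delta$ has compact support, $\Delta \in L^1(\mathbb{R})$, and since also $F \in L^1(\mathbb{R})$, the inversion formula

$$\Delta(\omega) = \frac{1}{2\pi}\, \mathcal{F}(F)(\omega)$$

holds.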


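Corollary 1.49 (Continuous analogue of the theorem of Weierstraß). Let $C_c(\mathbb{R})$
denote the space of all continuous functions $\mathbb{R} \to \mathbb{C}$ with compact support. The set
of functions $f \in L^1(\mathbb{R})$ with $\mathcal{F}(f) \in C_c(\mathbb{R})$ forms a dense (with respect to the norm
$\|\cdot\|_1$) subspace of $L^1(\mathbb{R})$.

Proof. For $f \in L^1(\mathbb{R})$ and $\lambda > 0$,

$$\mathcal{F}(F_\lambda * f)(\omega) = \mathcal{F}(F_\lambda)(\omega)\, \mathcal{F}(f)(\omega) = \chi_{[-\lambda,\lambda]}(\omega) \Bigl( 1 - \frac{|\omega|}{\lambda} \Bigr) \mathcal{F}(f)(\omega),$$

i.e., $\mathcal{F}(F_\lambda * f) \in C_c(\mathbb{R})$. The claim follows now from the fact that the Fejér kernel
is a summation kernel. □

Corollary 1.50 (Riemann-Lebesgue lemma).

Let $f \in L^1(\mathbb{R})$. Then $\lim_{|\omega| \to \infty} \mathcal{F}(f)(\omega) = 0$.

Proof. Use Corollary 1.49. The details are left to the reader. □

Theorem 1.51. Let $f \in L^1(\mathbb{R})$ and define

$$\sigma_\lambda(f)(t) := \frac{1}{2\pi} \int_{-\lambda}^{\lambda} \Bigl( 1 - \frac{|\omega|}{\lambda} \Bigr) \mathcal{F}(f)(\omega)\, e^{i\omega t}\, d\omega.$$

Then $\sigma_\lambda(f)(t) \to f(t)$ as $\lambda \to \infty$, for almost all $t \in \mathbb{R}$.

If $t$ is a point of continuity of $f$, then $\sigma_\lambda(f)(t)$ converges to $f(t)$.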


Proof. See, for instance, [162]. □ 
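
Example 1.52 (Additional summation kernels on $\mathbb{R}$).

(i) Cauchy-Poisson kernel

$$P(t) := \frac{1}{\pi(1 + t^2)}, \quad P_\lambda(t) := \lambda P(\lambda t), \quad \lambda \in (0, \infty),$$

is a summation kernel (see Fig. 1.3).

$$\frac{1}{\pi} \int_{\mathbb{R}} \frac{1}{1 + t^2}\, dt = \frac{1}{\pi} \arctan(t) \Big|_{t=-\infty}^{\infty} = 1.$$

Fig. 1.3: (a) The Cauchy-Poisson kernel $P$ and (b) its Fourier transform in $L^1(\mathbb{R})$. (c) For scales
$\lambda \cdot P(\lambda t)$ with increasing $\lambda = 1, 2, 3$, the Cauchy-Poisson kernel becomes narrower and higher,
and (d) the Fourier transform wider.

Hence, $(P_\lambda)_{\lambda \in (0,\infty)}$ is a summation kernel. The Fourier transform is given by

$$\mathcal{F}(P_\lambda)(\omega) = \exp\Bigl( -\frac{|\omega|}{\lambda} \Bigr),$$

since

$$f(t) = \chi_{[0,\infty)}(t)\, e^{-t}, \quad \mathcal{F}(f)(\omega) = \frac{1}{1 + i\omega},$$

$$g(t) = \chi_{(-\infty,0]}(t)\, e^{t}, \quad \mathcal{F}(g)(\omega) = \frac{1}{1 - i\omega}.$$
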
1 Introduction: Mathematical Aspects of Time-Frequency Analysis 




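Set $h(t) = f(t) + g(t) = e^{-|t|}$. Then $\mathcal{F}(h)(\omega) = 2/(1 + \omega^2)$. Since $h, \mathcal{F}(h) \in
L^1(\mathbb{R})$, application of the inversion theorem yields

$$\begin{aligned}
\mathcal{F}(P_\lambda)(\omega) &= \int_{\mathbb{R}} P_\lambda(t)\, e^{-i\omega t}\, dt \\
&= \frac{\lambda}{2\pi} \int_{\mathbb{R}} \mathcal{F}(h)(\lambda t)\, e^{-i\omega t}\, dt \\
&= \frac{1}{2\pi} \int_{\mathbb{R}} \mathcal{F}(h)(s)\, e^{-i\frac{\omega}{\lambda} s}\, ds \\
&= h\Bigl( -\frac{\omega}{\lambda} \Bigr) = e^{-|\omega|/\lambda}.
\end{aligned}$$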

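(ii) Gauß kernel

$$G(t) := \frac{1}{\sqrt{\pi}}\, e^{-t^2}, \quad G_\lambda(t) := \lambda G(\lambda t), \quad \lambda \in (0, \infty)$$

(see Fig. 1.4). Note that $\int_{\mathbb{R}} G(t)\, dt = 1$. Thus, $\{G_\lambda\}_{\lambda \in (0,\infty)}$ is a summation
kernel. Moreover,

$$\mathcal{F}(G)(\omega) = \exp\Bigl( -\Bigl( \frac{\omega}{2} \Bigr)^2 \Bigr).$$

Fig. 1.4: (a) The Gauß kernel $G$ and (b) its Fourier transform in $L^1(\mathbb{R})$. (c) For scales $\lambda \cdot G(\lambda t)$
with increasing $\lambda = 1, 2, 3$, the Gauß kernel becomes narrower and higher, and (d) the Fourier
transform wider.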
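
The transform pairs used for the Cauchy-Poisson kernel can be checked numerically. The sketch below is illustrative and rests on assumptions: the convention $\mathcal{F}(f)(\omega) = \int f(t) e^{-i\omega t}\, dt$, a midpoint rule on a truncated interval, and the particular values of `omega` and `lam`; the helper names are hypothetical. It compares $\mathcal{F}(e^{-|t|})(\omega)$ with $2/(1+\omega^2)$ and $\mathcal{F}(P_\lambda)(\omega)$ with $e^{-|\omega|/\lambda}$.

```python
import cmath
import math


def fourier(f, omega, half_width, steps):
    """Midpoint-rule approximation of int_{-T}^{T} f(t) exp(-i omega t) dt, T = half_width."""
    h = 2.0 * half_width / steps
    total = 0j
    for j in range(steps):
        t = -half_width + (j + 0.5) * h
        total += f(t) * cmath.exp(-1j * omega * t)
    return total * h


def h_fun(t):
    """Two-sided exponential h(t) = exp(-|t|)."""
    return math.exp(-abs(t))


def poisson(lam):
    """Dilated Cauchy-Poisson kernel P_lam(t) = lam * P(lam * t)."""
    return lambda t: lam / (math.pi * (1.0 + (lam * t) ** 2))


omega, lam = 1.5, 2.0
Fh = fourier(h_fun, omega, 40.0, 40000)            # compare with 2 / (1 + omega^2)
FP = fourier(poisson(lam), omega, 200.0, 80000)    # compare with exp(-|omega| / lam)
```

The Poisson kernel decays only like $1/t^2$, so its integral needs a much wider truncation window than the exponentially decaying $h$.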

1.3.2 The Plancherel Transform

For many applications, a Hilbert space model is more appropriate since in this case
an inner product is available. In this section, we extend the Fourier transform to the
Hilbert space $L^2(\mathbb{R})$.

To this end, we use the fact that $C_c(\mathbb{R})$, the space of all continuous functions with
compact support, is a subspace of $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ and dense in $L^1(\mathbb{R})$ and $L^2(\mathbb{R})$ in
the respective norm topology.

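Lemma 1.53. If $f \in C_c(\mathbb{R})$, then $\mathcal{F}(f) \in L^2(\mathbb{R})$, and

$$\frac{1}{2\pi} \int_{\mathbb{R}} |\mathcal{F}(f)(\omega)|^2\, d\omega = \int_{\mathbb{R}} |f(t)|^2\, dt.$$

Proof. See, for instance, [209]. □

Definition 1.54. Suppose that $f \in L^2(\mathbb{R})$ and $\{f_n\}_{n \in \mathbb{N}} \subset C_c(\mathbb{R})$ is an arbitrary
sequence that converges in $L^2(\mathbb{R})$ to $f$. The limit

$$\mathcal{P}(f) = \lim_{n \to \infty} \mathcal{F}(f_n) \in L^2(\mathbb{R})$$

in the $L^2(\mathbb{R})$-norm is called the Plancherel transform of $f$.

Remark 1.55. The Plancherel transform and the Fourier transform are both defined
on $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ and agree there. Indeed, let $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ and choose a
sequence $\{f_n\}_{n \in \mathbb{N}} \subset C_c(\mathbb{R})$ that converges in $L^1(\mathbb{R})$ and $L^2(\mathbb{R})$ to $f$. Then

$$\|\mathcal{F}(f_n) - \mathcal{P}(f)\|_2 \to 0 \quad \text{and} \quad \mathcal{F}(f_n)(\omega) \to \mathcal{F}(f)(\omega), \quad \text{for all } \omega \in \mathbb{R} \text{ and } n \to \infty.$$

Since $\mathcal{F}(f_n) \to \mathcal{P}(f)$ in $L^2(\mathbb{R})$, there exists a subsequence $\{n_k\}_{k \in \mathbb{N}} \subset \mathbb{N}$ so
that $\mathcal{F}(f_{n_k})(\omega) \to \mathcal{P}(f)(\omega)$ for almost all $\omega \in \mathbb{R}$. Hence, $\mathcal{P}(f) = \mathcal{F}(f)$ almost
everywhere.

Lemma 1.56. Let $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$. Then $\mathcal{P}(f) = \mathcal{F}(f)$ almost everywhere.
Moreover,

$$\frac{1}{2\pi} \int_{\mathbb{R}} |\mathcal{F}(f)(\omega)|^2\, d\omega = \int_{\mathbb{R}} |f(t)|^2\, dt.$$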
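
The energy identity of Lemma 1.56 can be illustrated with the function $f(t) = e^{-|t|}$ from Example 1.52, whose transform $2/(1+\omega^2)$ is known in closed form. The sketch below is a numerical illustration only; the midpoint rule, the truncation intervals, and the helper names are assumptions. Both sides of the identity should come out close to 1.

```python
import math


def integrate(f, a, b, steps):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (j + 0.5) * h) for j in range(steps))


def f(t):
    """Test function exp(-|t|), an element of both L^1 and L^2."""
    return math.exp(-abs(t))


def f_hat(omega):
    """Closed-form Fourier transform of exp(-|t|) (see Example 1.52)."""
    return 2.0 / (1.0 + omega * omega)


# int |f(t)|^2 dt
time_energy = integrate(lambda t: f(t) ** 2, -30.0, 30.0, 60000)

# (1 / (2 pi)) * int |F(f)(omega)|^2 d omega
freq_energy = integrate(lambda w: f_hat(w) ** 2, -400.0, 400.0, 400000) / (2.0 * math.pi)
```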

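Theorem 1.57 (An equivalent description of the Plancherel transform). Let $f \in
L^2(\mathbb{R})$. For $\lambda > 0$, let $f_\lambda := \chi_{[-\lambda,\lambda]} f$. Then $f_\lambda \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ and $\mathcal{F}(f_\lambda) \in L^2(\mathbb{R})$.
In particular,

$$\lim_{\lambda \to \infty} \|\mathcal{P}(f) - \mathcal{F}(f_\lambda)\|_2 = 0.$$

Proof. Exercise! (Hint: Use Lemma 1.56.) □

So far we know that the Plancherel transform is a bounded linear operator $\mathcal{P} :
L^2(\mathbb{R}) \to L^2(\mathbb{R})$. In addition, we know that

$$\frac{1}{\sqrt{2\pi}} \|\mathcal{P}(f)\|_2 = \|f\|_2, \quad \forall f \in L^2(\mathbb{R}).$$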
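
Hence, $\mathcal{P}$ is injective. To show that $\mathcal{P}$ is also surjective, we prove an inversion
formula.

Lemma 1.58 (Parseval). Let $f, g \in L^2(\mathbb{R})$. Then

$$\langle \mathcal{P}(f), \bar{g} \rangle = \langle \mathcal{P}(g), \bar{f} \rangle.$$

Proof. Let $f, g \in C_c(\mathbb{R})$. Fubini’s theorem yields

$$\int_{\mathbb{R}} \mathcal{F}(f)(\omega)\, g(\omega)\, d\omega = \int_{\mathbb{R}} \int_{\mathbb{R}} f(t)\, e^{-i\omega t}\, dt\, g(\omega)\, d\omega = \int_{\mathbb{R}} f(t)\, \mathcal{F}(g)(t)\, dt = \langle \mathcal{F}(g), \bar{f} \rangle.$$

For $f, g \in L^2(\mathbb{R})$, choose convergent sequences $\{f_n\}_{n \in \mathbb{N}}, \{g_n\}_{n \in \mathbb{N}} \subset C_c(\mathbb{R})$ so that
$f_n \to f$ and $g_n \to g$ as $n \to \infty$ in the $L^2(\mathbb{R})$-norm. From the continuity of the inner
product, it follows that

$$\langle \mathcal{P}(f), \bar{g} \rangle = \lim_{n,m \to \infty} \langle \mathcal{F}(f_n), \bar{g}_m \rangle = \lim_{n,m \to \infty} \langle \mathcal{F}(g_m), \bar{f}_n \rangle = \langle \mathcal{P}(g), \bar{f} \rangle.$$ □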

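Theorem 1.59 (Inversion formula for the Plancherel transform). Let $f \in L^2(\mathbb{R})$.
Then

$$f = \frac{1}{2\pi}\, \overline{\mathcal{P}\bigl( \overline{\mathcal{P}(f)} \bigr)} \quad \text{almost everywhere.}$$

Proof. Set $g := \overline{\mathcal{P}(f)}$. The linearity of the inner product yields

$$\Bigl\| f - \frac{1}{2\pi}\, \overline{\mathcal{P}(g)} \Bigr\|_2^2 = \|f\|_2^2 - \frac{1}{2\pi} \langle f, \overline{\mathcal{P}(g)} \rangle - \frac{1}{2\pi} \langle \overline{\mathcal{P}(g)}, f \rangle + \frac{1}{4\pi^2} \|\mathcal{P}(g)\|_2^2.$$

By Parseval’s equality, we have that

$$\frac{1}{2\pi} \langle f, \overline{\mathcal{P}(g)} \rangle = \frac{1}{2\pi} \langle \mathcal{P}(g), \bar{f} \rangle = \frac{1}{2\pi} \langle \mathcal{P}(f), \bar{g} \rangle = \frac{1}{2\pi} \|\mathcal{P}(f)\|_2^2 = \|f\|_2^2$$

and

$$\frac{1}{2\pi} \langle \overline{\mathcal{P}(g)}, f \rangle = \frac{1}{2\pi}\, \overline{\langle f, \overline{\mathcal{P}(g)} \rangle} = \|f\|_2^2,$$

as well as

$$\frac{1}{4\pi^2} \|\mathcal{P}(g)\|_2^2 = \frac{1}{2\pi} \|g\|_2^2 = \frac{1}{2\pi} \|\mathcal{P}(f)\|_2^2 = \|f\|_2^2.$$

Thus,

$$\Bigl\| f - \frac{1}{2\pi}\, \overline{\mathcal{P}(g)} \Bigr\|_2 = 0,$$

which implies the statement in the theorem. □

Hence, $\mathcal{P} : L^2(\mathbb{R}) \to L^2(\mathbb{R})$ is a topological automorphism.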
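
In summary, we arrive at the following theorem.

Theorem 1.60 (Plancherel). The linear operator $\mathcal{P} : L^2(\mathbb{R}) \to L^2(\mathbb{R})$ is an isomorphism
from $L^2(\mathbb{R})$ onto $L^2(\mathbb{R})$. It has the following properties.

1. For all $f, g \in L^2(\mathbb{R})$, one has that

$$\frac{1}{2\pi} \langle \mathcal{P}(f), \mathcal{P}(g) \rangle = \langle f, g \rangle.$$

In particular,

$$\frac{1}{\sqrt{2\pi}} \|\mathcal{P}(f)\|_2 = \|f\|_2.$$

2. $\mathcal{P}(f) = \mathcal{F}(f)$ for $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$.

3. $\mathcal{P}(f)(\omega) = \lim_{\lambda \to \infty} \int_{-\lambda}^{\lambda} f(t)\, e^{-i\omega t}\, dt$ in the $L^2$-norm.

4. $f(t) = \frac{1}{2\pi} \lim_{\lambda \to \infty} \int_{-\lambda}^{\lambda} \mathcal{P}(f)(\omega)\, e^{i\omega t}\, d\omega$ in the $L^2$-norm.

Proof. Only item 1 needs to be shown. However, Plancherel’s formula and the
inversion theorem imply that

$$\frac{1}{2\pi} \langle \mathcal{P}(f), \mathcal{P}(g) \rangle = \frac{1}{2\pi} \bigl\langle \mathcal{P}\bigl( \overline{\mathcal{P}(g)} \bigr), \bar{f} \bigr\rangle = \langle \bar{g}, \bar{f} \rangle = \langle f, g \rangle.$$ □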


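We introduce the convolution $* : L^1(\mathbb{R}) \times L^p(\mathbb{R}) \to L^p(\mathbb{R})$, $1 \le p < \infty$. For $f \in L^p(\mathbb{R})$
and $g \in L^1(\mathbb{R})$, define

$$g * f(t) := \int_{\mathbb{R}} f(t - s)\, g(s)\, ds.$$

One can show that $g * f$ exists for almost all $t \in \mathbb{R}$ and that $g * f \in L^p(\mathbb{R})$, satisfying
the estimate

$$\|g * f\|_p \le \|g\|_1 \|f\|_p.$$

In the special case $p := 2$, we obtain the next result.

Theorem 1.61. Let $g \in L^1(\mathbb{R})$ and $f \in L^2(\mathbb{R})$. Then

$$g * f(t) = \int_{\mathbb{R}} f(t - s)\, g(s)\, ds$$

is an element of $L^2(\mathbb{R})$ and the estimate

$$\|g * f\|_2 \le \|g\|_1 \|f\|_2$$

holds. Moreover, the Plancherel transform satisfies

$$\mathcal{P}(g * f) = \mathcal{F}(g) \cdot \mathcal{P}(f).$$

Proof. Only the last statement needs to be proven. This, however, is left as an
exercise to the reader. □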
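
Note that if $f \in L^2(\mathbb{R})$, then $F_\lambda * f \in L^2(\mathbb{R})$. In particular,

$$F_\lambda * f(t) = \frac{1}{2\pi} \int_{-\lambda}^{\lambda} \Bigl( 1 - \frac{|\omega|}{\lambda} \Bigr) \mathcal{P}(f)(\omega)\, e^{i\omega t}\, d\omega.$$

This last equality can be shown as follows. Set $\varphi(\omega) := \Delta_\lambda(\omega)\, e^{i\omega t}$. Then

$$\mathcal{F}(\varphi)(s) = \int_{\mathbb{R}} \Delta_\lambda(\omega)\, e^{-i(s - t)\omega}\, d\omega = F_\lambda(s - t).$$

Since $\varphi$ is continuous and has compact support, $\mathcal{P}(\varphi) = \mathcal{F}(\varphi)$ holds. Plancherel’s
theorem then implies

$$F_\lambda * f(t) = \int_{\mathbb{R}} F_\lambda(t - s)\, f(s)\, ds = \int_{\mathbb{R}} F_\lambda(s - t)\, f(s)\, ds = \int_{\mathbb{R}} \mathcal{P}(\varphi)(s)\, f(s)\, ds = \int_{\mathbb{R}} \mathcal{P}(f)(\omega)\, \varphi(\omega)\, d\omega.$$

Using arguments analogous to those in the proof of Theorem 1.51, one can
establish the next result.

Theorem 1.62. Let $f \in L^2(\mathbb{R})$ and let

$$\sigma_\lambda(f)(t) := \frac{1}{2\pi} \int_{-\lambda}^{\lambda} \Bigl( 1 - \frac{|\omega|}{\lambda} \Bigr) \mathcal{P}(f)(\omega)\, e^{i\omega t}\, d\omega.$$

Then $\sigma_\lambda(f)(t) \to f(t)$ as $\lambda \to \infty$ for almost all $t \in \mathbb{R}$.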

1.3.3 The Theorem of Paley-Wiener 

In this section, we exhibit the connection between Fourier series and the sampling 
theorem. The link is the theorem of Paley and Wiener. 

Theorem 1.63 (Paley-Wiener). Suppose that $f \in L^2(\mathbb{R})$. Then the following are
equivalent.

1. $\mathcal{F}(f)|_{\mathbb{R} \setminus [-\delta,\delta]} = 0$, a.e., for some $\delta > 0$.

2. $f$ can be extended to an entire function $F : \mathbb{C} \to \mathbb{C}$ of exponential type $\delta$, i.e.,

$$|F(z)| \le M \exp(\delta |z|), \quad \forall z \in \mathbb{C},$$

for some constant $M > 0$.


Proof See [166, 209]. □ 

Definition 1.64. An $L^2(\mathbb{R})$-function satisfying $\mathcal{F}(f)|_{\mathbb{R} \setminus [-\delta,\delta]} = 0$, a.e., is called
band-limited. The number $2\delta$ is referred to as the bandwidth of $f$. The space of
all band-limited functions is called a Paley-Wiener space and is denoted by $PW_\delta$.

Fig. 1.5: The isometry of the Fourier transform on the torus $\mathbb{T}$ in both the theorem of Paley-Wiener
and in the sampling theorem generates a commutative diagram.


1.3.4 Discretization: The Poisson Summation Formula and the 
Sampling Theorem 

Let $T > 0$. Suppose that $f \in L^1(\mathbb{R})$ and that the following two conditions are
satisfied:

(P1) The series $\sum_{n=-\infty}^{\infty} f(t + 2nT)$ converges everywhere to a continuous function.
(P2) The Fourier series $\sum_{n=-\infty}^{\infty} \mathcal{F}(f)\bigl(\frac{\pi n}{T}\bigr)\, e^{i\pi n t/T}$ converges everywhere.

Under the above conditions, a relationship between Fourier coefficients and the
Fourier transform can be established. The result is the Poisson summation formula,
which can be stated in the form

$$\sum_{n=-\infty}^{\infty} f(t + 2nT) = \frac{1}{2T} \sum_{n=-\infty}^{\infty} \mathcal{F}(f)\Bigl( \frac{\pi n}{T} \Bigr) e^{i\pi n t/T}.$$

A proof of the Poisson summation formula can be found in, e.g., [151]. 
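
The Poisson summation formula can be tested numerically on a Gaussian, for which both sides converge extremely fast. The sketch below is an illustration under explicit assumptions: it uses the convention $\mathcal{F}(f)(\omega) = \int f(t) e^{-i\omega t}\, dt$ and the resulting form $\sum_n f(t + 2nT) = \frac{1}{2T} \sum_n \mathcal{F}(f)(\pi n / T)\, e^{i\pi n t / T}$ of the formula; the function names and the values of `T` and `t` are arbitrary choices.

```python
import math


def f(t):
    """Gaussian test function exp(-t^2 / 2)."""
    return math.exp(-0.5 * t * t)


def f_hat(w):
    """Its transform sqrt(2 pi) * exp(-w^2 / 2) under the stated convention."""
    return math.sqrt(2.0 * math.pi) * math.exp(-0.5 * w * w)


def periodization(t, T, terms=50):
    """Left-hand side: sum_n f(t + 2 n T), truncated to |n| <= terms."""
    return sum(f(t + 2.0 * n * T) for n in range(-terms, terms + 1))


def fourier_side(t, T, terms=50):
    """Right-hand side: (1 / 2T) * sum_n F(f)(pi n / T) * exp(i pi n t / T).

    f_hat is real and even here, so the terms for n and -n combine into cosines.
    """
    total = f_hat(0.0)
    for n in range(1, terms + 1):
        total += 2.0 * f_hat(math.pi * n / T) * math.cos(math.pi * n * t / T)
    return total / (2.0 * T)


T, t = 1.3, 0.4
lhs = periodization(t, T)
rhs = fourier_side(t, T)
```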

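The next theorem gives conditions on $f$ and its Fourier transform $\mathcal{F}(f)$ under
which (P1) and (P2) automatically hold.

Theorem 1.65. Suppose that $f$ is a Lebesgue-measurable function satisfying

$$f(t) = \mathcal{O}\bigl( |t|^{-\alpha} \bigr) \quad \text{and} \quad \mathcal{F}(f)(\omega) = \mathcal{O}\bigl( |\omega|^{-\alpha} \bigr)$$

for some $\alpha > 1$ as $|t| \to \infty$ and $|\omega| \to \infty$. Then conditions (P1) and (P2) hold.
Here, $\mathcal{O}$ denotes the Landau symbol.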

Theorem 1.66. Suppose that $f \in L^2(\mathbb{R}) \cap C(\mathbb{R})$ and that $\mathcal{F}(f) \in L^1(\mathbb{R})$. Then the
following estimate holds for every $\delta > 0$:

$$\sup_{t \in \mathbb{R}} \Bigl| f(t) - \sum_{n=-\infty}^{\infty} f\Bigl( \frac{n}{\delta} \Bigr) \operatorname{sinc}(\delta t - n) \Bigr| \le \frac{1}{\pi} \int_{|\omega| > \pi\delta} |\mathcal{F}(f)(\omega)|\, d\omega.$$

Proof. See [162]. □

Considering band-limited functions in Theorem 1.66 immediately yields the
Shannon-Whittaker-Kotel’nikov sampling theorem.

Corollary 1.67 (Shannon-Whittaker-Kotel’nikov sampling theorem). Let $f \in
L^2(\mathbb{R}) \cap C(\mathbb{R})$ be a band-limited function with bandwidth $2\pi\delta$, where $\delta > 0$. Then,
for every $t \in \mathbb{R}$,

$$f(t) = \sum_{n=-\infty}^{\infty} f\Bigl( \frac{n}{\delta} \Bigr) \operatorname{sinc}(\delta t - n).$$

Moreover, the series on the right-hand side converges uniformly and absolutely
on $\mathbb{R}$.

Defining B to be the smallest positive real number for which 
W)Ir\[ -nB,nB] = a.e., Corollary 1.67 expresses the fact that B is the largest 
sampling rate allowing the exact reconstruction of / in terms of the samples f(n/B ), 
n E Z. This value of B is called the Nyquist rate. 
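The sampling series can be tried out on a concrete band-limited function. The sketch below assumes the normalized cardinal sine $\operatorname{sinc}(x) = \sin(\pi x)/(\pi x)$ and uses the test function $f(t) = \operatorname{sinc}(t/2)^2$, whose Fourier transform is supported in $[-\pi, \pi]$, so that $\delta = 1$ is admissible. The truncation length and all names are illustrative assumptions.

```python
import math


def sinc(x):
    """Normalized cardinal sine sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def f(t):
    """Band-limited test function; its transform is a triangle on [-pi, pi]."""
    return sinc(t / 2.0) ** 2


def reconstruct(t, delta, terms=2000):
    """Truncated sampling series: sum of f(n / delta) sinc(delta t - n)."""
    return sum(f(n / delta) * sinc(delta * t - n) for n in range(-terms, terms + 1))


if __name__ == "__main__":
    t = 0.3
    print("exact value        ", f(t))
    print("delta = 1.0 (ok)   ", reconstruct(t, 1.0))
    print("delta = 0.5 (alias)", reconstruct(t, 0.5))
```

With $\delta = 1$ the truncated series matches $f(t)$ up to the truncation error, whereas sampling at the lower rate $\delta = 0.5$ produces a visibly different value, as the error estimate of Theorem 1.66 suggests.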

1.4 Windowed Fourier Transforms 

The Fourier transform, respectively the Plancherel transform, is not stable with respect to local
changes in the time or frequency domain. As an example, consider Fig. 1.6, where a
small local change of the signal spreads over the whole frequency spectrum. This is
due to the fact that the analyzing function family consisting of sine and cosine functions
is not local, but global.

In order to obtain a better localization of a signal in both the time and frequency
domains, the ordinary Fourier transform is modified by multiplying the signal $f$ by
a window function $\varphi$. The present section defines such windowed Fourier transforms
and discusses the dependence of the filtered signal on the parameters defining
the window function. Particular emphasis is placed on a specific windowed Fourier
transform, namely the Gabor transform.


1.4.1 The Short-Time Fourier Transform (STFT) 

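Definition 1.68. Suppose that $\varphi \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ and $f \in L^2(\mathbb{R})$. For $\omega \in \mathbb{R}$ and
$b \in \mathbb{R}$, define

\[ \mathcal{G}_b(f)(\omega) := \int_{\mathbb{R}} f(t)\, \overline{\varphi(t - b)}\, e^{-i\omega t}\, dt = \langle f, W_{b,\omega} \rangle, \tag{1.14} \]

where $W_{b,\omega}(t) := e^{i\omega t} \varphi(t - b)$.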







Peter Massopust and Brigitte Forster 




Fig. 1.6: (a) shows a function $f$ and (b) the modulus of its frequency spectrum with respect to the
Fourier transform in $L^2(\mathbb{R})$. (c) A perturbation was added to the function $f$, causing a global
change in the modulus of its frequency spectrum (d).

Then $\mathcal{G}_b(f)(\omega)$ is called the short-time Fourier transform (STFT) of $f$.

Hölder’s inequality and the Cauchy-Schwarz inequality imply that $f \cdot \varphi(\cdot - b) \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$.
Thus, the STFT has properties analogous to those of the Fourier and
Plancherel transforms.


1.4.2 The Gabor Transform 

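Consider the Gaussian

\[ g_s(t) := \frac{1}{2\sqrt{\pi s}}\, e^{-t^2/(4s)} \tag{1.15} \]

with the parameter $s > 0$. Recall that $g_s(t) = G_\lambda(t)$, where $\lambda = 1/(2\sqrt{s})$. With this
window in the STFT, we get the following special case.

Definition 1.69. Let $f \in L^2(\mathbb{R})$, and let $s > 0$ and $b \in \mathbb{R}$. Define

\[ \mathcal{G}_b^s(f)(\omega) := \int_{\mathbb{R}} f(t)\, g_s(t - b)\, e^{-i\omega t}\, dt. \tag{1.16} \]

Then $\mathcal{G}_b^s(f)(\omega)$ is called the Gabor transform of $f$ with parameters $s$ and $b$.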

The Gabor transform is a so-called time-frequency representation, where the
parameter $b$ models the time variable, i.e., the position of the window $g_s$, and $\omega$
the frequency part. The parameter $s$ describes the “width” of the time window.

1.4.2.1 The Gabor Transform as a Fourier and Plancherel Transform: The Viewpoint of the Variable $\omega$

The variable $\omega$ appears in (1.16) in the exponential term, similar to the usual Fourier
or Plancherel transform. Since the Gabor transform is localized with the integral
kernel $g_s$, the variable $\omega$ can be interpreted as a local frequency and $|\mathcal{G}_b^s(f)(\omega)|$ as
a local amplitude for a signal in a neighborhood of $b$. In fact, the variable $\omega$ allows
for similar properties as the Fourier and the Plancherel transform, as the following
proposition shows.

Proposition 1.70. Let $f \in L^2(\mathbb{R})$. Then

1. $\mathcal{G}_b^s(f) \in C_b^u(\mathbb{R})$.

2. $\lim_{|\omega| \to \infty} \mathcal{G}_b^s(f)(\omega) = 0$ (variation of the Riemann-Lebesgue lemma).

3. $\mathcal{G}_b^s(f) \in L^2(\mathbb{R})$.

Proof. The Hölder and Cauchy-Schwarz inequalities imply $f \cdot g_s(\cdot - b) \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$. Thus,

\[ \mathcal{G}_b^s(f) = \mathcal{F}(f \cdot g_s(\cdot - b)), \]

and statements 1 and 2 follow from the properties of the Fourier transform on $L^1(\mathbb{R})$.
Moreover, the same identity

\[ \mathcal{G}_b^s(f) = \mathcal{F}(f \cdot g_s(\cdot - b)), \]

now read as a Plancherel transform,
together with the fact that the Plancherel transform is an isometry on $L^2(\mathbb{R})$, yields
statement 3. □


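1.4.2.2 The Gabor Transform from the Viewpoint of the Variable $b$: Window
Translation

The family of functions $\{\mathcal{G}_b^s(f)(\omega) : b \in \mathbb{R}\}$ partitions $\mathcal{F}(f)(\omega)$ into a set of Gabor
transforms. To see this, we first define the modulation operator

\[ M_\omega : L^p(\mathbb{R}) \to L^p(\mathbb{R}), \qquad f \mapsto M_\omega f := e^{i\omega \cdot} f, \]

for all $\omega \in \mathbb{R}$, and the translation operator

\[ T_b : L^p(\mathbb{R}) \to L^p(\mathbb{R}), \qquad f \mapsto T_b f := f(\cdot - b), \]

for all $b \in \mathbb{R}$, where in both cases $1 \le p \le \infty$.

Theorem 1.71. Let $f \in L^2(\mathbb{R})$. Then the following hold:

1. The mapping $\mathcal{G}_\cdot^s(f)(\omega) : \mathbb{R} \to \mathbb{C}$, $b \mapsto \mathcal{G}_b^s(f)(\omega)$ is continuous.

2. $\mathcal{G}_\cdot^s(f)(\omega) \in L^2(\mathbb{R})$ and $\|\mathcal{G}_\cdot^s(f)(\omega)\|_2 \le \|g_s\|_1 \cdot \|f\|_2$, $\forall\, s > 0$, $\forall\, \omega \in \mathbb{R}$.

3. $\mathcal{F}(\mathcal{G}_\cdot^s(f)(\omega)) = \mathcal{F}(g_s) \cdot \mathcal{F}(M_{-\omega} f)$ is in $L^2(\mathbb{R})$, where $\mathcal{F}(g_s)(\omega) = e^{-s\omega^2}$.

4. If $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, then $\mathcal{G}_\cdot^s(f)(\omega) \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ and

\[ \int_{\mathbb{R}} \mathcal{G}_b^s(f)(\omega)\, db = \mathcal{F}(f)(\omega), \qquad \forall\, \omega \in \mathbb{R},\ s > 0. \]

Proof. 1. Follows from the continuity of the function $\mathbb{R} \to L^2(\mathbb{R})$, $b \mapsto T_b g_s = g_s(\cdot - b)$,
and the dominated convergence theorem.

2. Since $g_s$ is even, $\mathcal{G}_b^s(f)(\omega) = \int_{\mathbb{R}} f(t)\, e^{-i\omega t}\, g_s(t - b)\, dt = (g_s * (M_{-\omega} f))(b)$, with the convolution $* : L^1(\mathbb{R}) \times L^2(\mathbb{R}) \to L^2(\mathbb{R})$. Thus,

\[ \|\mathcal{G}_\cdot^s(f)(\omega)\|_2 \le \|g_s\|_1 \|M_{-\omega} f\|_2 = \|g_s\|_1 \|e^{-i\omega \cdot} f\|_2 = \|g_s\|_1 \|f\|_2. \]

3. $\mathcal{F}(\mathcal{G}_\cdot^s(f)(\omega)) = \mathcal{F}(g_s * M_{-\omega} f) = \mathcal{F}(g_s)\, \mathcal{F}(M_{-\omega} f)$.

4. $\mathcal{G}_\cdot^s(f)(\omega) = g_s * M_{-\omega} f \in L^1(\mathbb{R})$, since $g_s$ and $f$ are in $L^1(\mathbb{R})$. Hence,

\[ \mathcal{F}(\mathcal{G}_\cdot^s(f)(\omega))(\rho) = \mathcal{F}(g_s)(\rho)\, \mathcal{F}(M_{-\omega} f)(\rho), \qquad \forall\, \rho \in \mathbb{R}. \]

Setting $\rho = 0$ yields the claim, because $\mathcal{F}(g_s)(0) = 1$ and $\mathcal{F}(M_{-\omega} f)(0) = \mathcal{F}(f)(\omega)$. □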
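Statement 4 of Theorem 1.71 lends itself to a numerical check. The sketch below is only an illustration under stated assumptions: the window is taken as $g_s(t) = e^{-t^2/(4s)}/(2\sqrt{\pi s})$, the signal is $f(t) = e^{-t^2/2}$ with $\mathcal{F}(f)(\omega) = \sqrt{2\pi}\, e^{-\omega^2/2}$, and all integrals are replaced by Riemann sums on truncated grids whose sizes are arbitrary choices.

```python
import cmath
import math


def gauss_window(t, s):
    """Assumed window g_s(t) = exp(-t^2 / (4 s)) / (2 sqrt(pi s)); it integrates to 1."""
    return math.exp(-t * t / (4.0 * s)) / (2.0 * math.sqrt(math.pi * s))


def signal(t):
    """Test signal f(t) = exp(-t^2 / 2)."""
    return math.exp(-t * t / 2.0)


def gabor(f, s, b, omega, lo=-8.0, hi=8.0, h=0.1):
    """Riemann sum for the integral of f(t) g_s(t - b) exp(-i omega t) over t."""
    n = int(round((hi - lo) / h))
    total = 0j
    for k in range(n + 1):
        t = lo + k * h
        total += f(t) * gauss_window(t - b, s) * cmath.exp(-1j * omega * t)
    return total * h


def marginal_over_b(f, s, omega, lo=-14.0, hi=14.0, h=0.1):
    """Riemann sum for the integral of the Gabor transform over the shift b."""
    n = int(round((hi - lo) / h))
    return h * sum(gabor(f, s, lo + k * h, omega) for k in range(n + 1))


if __name__ == "__main__":
    omega = 1.5
    expected = math.sqrt(2.0 * math.pi) * math.exp(-omega * omega / 2.0)
    print("integral over b:", marginal_over_b(signal, 0.25, omega))
    print("F(f)(omega):    ", expected)
```

The agreement does not depend on the window parameter $s$, which reflects the fact that integrating over all window positions removes the localization again.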


1.4.2.3 The Window Parameter s and Time-Frequency Localization 

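The Gabor transform is a special case of the localized STFT. A measure for the
width of the window is given by the standard deviation of the “density” $g_s^2$:

\[ \Delta_{g_s} := \frac{1}{\|g_s\|_2} \left( \int_{\mathbb{R}} t^2\, g_s(t)^2\, dt \right)^{1/2}. \]

The quantity $\Delta_{g_s}$ is called the radius of $g_s$. It is easy to verify that $\Delta_{g_s} = \sqrt{s}$. (Show
this!)

Theorem 1.72. Let $f \in L^2(\mathbb{R})$ and let $b, \omega \in \mathbb{R}$. Then

\[ \mathcal{G}_b^s(f)(\omega) = \frac{e^{-i\omega b}}{2\sqrt{\pi s}}\, \mathcal{G}_\omega^{1/(4s)}(\mathcal{F}(f))(-b). \]

Proof. Exercise! □

Remark 1.73. $\mathcal{G}_\omega^{1/(4s)}(\mathcal{F}(f))(-b)$ localizes in the frequency domain with radius $1/\sqrt{4s}$,
whereas $\mathcal{G}_b^s(f)(\omega)$ localizes in the time domain with radius $\sqrt{s}$.

A measure for the simultaneous localization in the time and frequency domains
is given by $\sqrt{s} \cdot 1/\sqrt{4s} = 1/2$.

The time-frequency window is thus $[b - \sqrt{s},\, b + \sqrt{s}] \times [\omega - 1/(2\sqrt{s}),\, \omega + 1/(2\sqrt{s})]$.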
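The two radii and their product can be confirmed by direct quadrature. The following sketch assumes the window $g_s(t) = e^{-t^2/(4s)}/(2\sqrt{\pi s})$ with transform $e^{-s\omega^2}$ and measures the radius as the standard deviation of the normalized density $|\cdot|^2$; the helper names and grid sizes are illustrative.

```python
import math


def radius(func, lo, hi, n=20000):
    """Standard deviation of the normalized density |func|^2 (midpoint rule on [lo, hi])."""
    h = (hi - lo) / n
    xs = [lo + (k + 0.5) * h for k in range(n)]
    weights = [abs(func(x)) ** 2 for x in xs]
    mass = sum(weights)
    center = sum(x * w for x, w in zip(xs, weights)) / mass
    variance = sum((x - center) ** 2 * w for x, w in zip(xs, weights)) / mass
    return math.sqrt(variance)


def time_frequency_radii(s):
    """Radii of the assumed Gaussian window g_s and of its transform exp(-s w^2)."""
    window = lambda t: math.exp(-t * t / (4.0 * s)) / (2.0 * math.sqrt(math.pi * s))
    window_hat = lambda w: math.exp(-s * w * w)
    return radius(window, -20.0, 20.0), radius(window_hat, -40.0, 40.0)


if __name__ == "__main__":
    for s in (0.05, 0.3, 2.0):
        r_time, r_freq = time_frequency_radii(s)
        # Expected: sqrt(s), 1 / (2 sqrt(s)), and the constant product 1/2.
        print(s, r_time, r_freq, r_time * r_freq)
```

Changing $s$ trades time resolution against frequency resolution, while the area of the time-frequency window stays fixed.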

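Theorem 1.74. Let $f, g \in L^2(\mathbb{R})$. Then

\[ \int_{\mathbb{R}} \int_{\mathbb{R}} \mathcal{G}_b^s(f)(\omega)\, \overline{\mathcal{G}_b^s(g)(\omega)}\, db\, d\omega = \frac{\sqrt{2\pi}}{2\sqrt{s}}\, \langle f, g \rangle. \]

Proof. Exercise! □

Theorem 1.75 (Inversion theorem). Assume that $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$. At all points
$t \in \mathbb{R}$ at which $f$ is continuous, the following inversion formula holds:

\[ f(t) = \sqrt{\frac{2s}{\pi}} \int_{\mathbb{R}} \int_{\mathbb{R}} \mathcal{G}_b^s(f)(\omega)\, g_s(t - b)\, e^{i\omega t}\, db\, d\omega. \]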

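Proof. Let $\{F_\lambda\}_{\lambda \in (0,\infty)}$ be the Fejér kernel on $\mathbb{R}$. Then $F_\lambda \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$.
Approximating $f$ with the Fejér kernel yields

\[ \langle f, T_a F_\lambda \rangle = \int_{\mathbb{R}} f(t)\, F_\lambda(t - a)\, dt = \int_{\mathbb{R}} f(t)\, F_\lambda(a - t)\, dt = (f * F_\lambda)(a). \]

This equation holds almost everywhere and pointwise at all points of continuity
of $f$. At these points, we also have pointwise convergence:

\[ \lim_{\lambda \to \infty} \langle f, T_a F_\lambda \rangle = f(a). \tag{1.17} \]

Moreover,

\begin{align*}
\mathcal{G}_b^s(T_a F_\lambda)(\omega) &= \int_{\mathbb{R}} F_\lambda(t - a)\, g_s(t - b)\, e^{-i\omega t}\, dt \\
&= e^{-i\omega b} \int_{\mathbb{R}} F_\lambda(t - a)\, g_s(t - b)\, e^{-i\omega (t - b)}\, dt \\
&= e^{-i\omega b} \int_{\mathbb{R}} F_\lambda(y + b - a)\, g_s(y)\, e^{-i\omega y}\, dy \\
&= e^{-i\omega b} \int_{\mathbb{R}} F_\lambda(a - b - y)\, g_s(y)\, e^{-i\omega y}\, dy \\
&= e^{-i\omega b}\, (F_\lambda * M_{-\omega} g_s)(a - b) \\
&\to e^{-i\omega b}\, M_{-\omega} g_s(a - b) = g_s(a - b)\, e^{-i\omega a} \quad \text{as } \lambda \to \infty. \tag{1.18}
\end{align*}

Now apply Theorem 1.72 to $f$ and $F_\lambda(\cdot - a)$ and use the dominated convergence
theorem to obtain

\[ \lim_{\lambda \to \infty} \int_{\mathbb{R}} \int_{\mathbb{R}} \mathcal{G}_b^s(f)(\omega)\, \overline{\mathcal{G}_b^s(T_a F_\lambda)(\omega)}\, db\, d\omega = \int_{\mathbb{R}} \int_{\mathbb{R}} \mathcal{G}_b^s(f)(\omega)\, g_s(a - b)\, e^{i\omega a}\, db\, d\omega. \]

On the other hand, by Theorem 1.74 and Eq. (1.17),

\[ \int_{\mathbb{R}} \int_{\mathbb{R}} \mathcal{G}_b^s(f)(\omega)\, \overline{\mathcal{G}_b^s(T_a F_\lambda)(\omega)}\, db\, d\omega = \frac{\sqrt{2\pi}}{2\sqrt{s}}\, \langle f, T_a F_\lambda \rangle \to \frac{\sqrt{2\pi}}{2\sqrt{s}}\, f(a) \quad \text{for } \lambda \to \infty, \]

at all points $a$ where $f$ is continuous. This gives the result. □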

For an illustration of the dependence of the Gabor transform on the parameter $s$,
see Figs. 1.7 and 1.8.

Fig. 1.7: (a) The function $f$ and (b) the Gauß kernels for different values of the window
parameter $s$. Figure 1.8 depicts the corresponding Gabor transforms.


1.4.3 The Heisenberg Uncertainty Principle 


In the context of localizing the Fourier transform, the following
question arises immediately: Is it possible to choose a window function $\varphi$ for the STFT whose
energy is well localized in both time and frequency? Unfortunately, it is impossible
to tell exactly which frequencies are present at a specific point in time. This is the content of
the so-called uncertainty principle, which states that a function $f$ and its Plancherel
transform $\mathcal{F}(f)$ cannot both have arbitrarily small support.


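Definition 1.76. Assume that $\varphi \in L^2(\mathbb{R})$ and that $\sqrt{|\cdot|}\, |\varphi| \in L^2(\mathbb{R})$.

1. $a^* := \dfrac{1}{\|\varphi\|_2^2} \int_{\mathbb{R}} t\, |\varphi(t)|^2\, dt$ is called the center of $\varphi$.

2. $\Delta_\varphi := \dfrac{1}{\|\varphi\|_2} \left( \int_{\mathbb{R}} (t - a^*)^2\, |\varphi(t)|^2\, dt \right)^{1/2}$ is called the radius of $\varphi$, occasionally also
the mean bandwidth or the mean running time.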


Theorem 1.77 (The Heisenberg uncertainty relation). Let $f \in L^2(\mathbb{R})$. Then

\[ \int_{\mathbb{R}} t^2\, |f(t)|^2\, dt \cdot \int_{\mathbb{R}} \omega^2\, |\mathcal{F}(f)(\omega)|^2\, d\omega \ge \frac{\pi}{2}\, \|f\|_2^4. \tag{1.19} \]

(We allow the left-hand side to assume the value “$\infty$”.) The left-hand side equals the
right-hand side iff $f(t) = c \cdot e^{-kt^2}$ for $k > 0$ and $c \in \mathbb{C}$.

Proof. We refer to [46] for the proof. □

The Heisenberg uncertainty relation was first proved in this form but under
stronger assumptions by H. Weyl.
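The uncertainty relation can be illustrated numerically without computing any Fourier transform. The sketch below assumes the unnormalized convention $\mathcal{F}(f)(\omega) = \int f(t) e^{-i\omega t}\, dt$, for which Plancherel's identity gives $\int \omega^2 |\mathcal{F}(f)(\omega)|^2\, d\omega = 2\pi \int |f'(t)|^2\, dt$ for smooth, rapidly decaying $f$, and the lower bound reads $\tfrac{\pi}{2}\|f\|_2^4$. The test functions, their derivatives, and the quadrature grid are illustrative assumptions.

```python
import math


def integrate(func, lo=-12.0, hi=12.0, n=6000):
    """Midpoint rule on [lo, hi]; adequate for the rapidly decaying integrands used here."""
    h = (hi - lo) / n
    return h * sum(func(lo + (k + 0.5) * h) for k in range(n))


def heisenberg_sides(f, df):
    """Return (left-hand side, right-hand side) of the uncertainty relation for real f."""
    time_spread = integrate(lambda t: t * t * f(t) ** 2)
    # Assumed Plancherel identity: int w^2 |F(f)(w)|^2 dw = 2 pi int |f'(t)|^2 dt.
    freq_spread = 2.0 * math.pi * integrate(lambda t: df(t) ** 2)
    norm_sq = integrate(lambda t: f(t) ** 2)
    return time_spread * freq_spread, 0.5 * math.pi * norm_sq ** 2


def gaussian(t):
    return math.exp(-t * t / 2.0)


def d_gaussian(t):
    return -t * math.exp(-t * t / 2.0)


def odd_bump(t):
    return t * math.exp(-t * t / 2.0)


def d_odd_bump(t):
    return (1.0 - t * t) * math.exp(-t * t / 2.0)


if __name__ == "__main__":
    print("Gaussian :", heisenberg_sides(gaussian, d_gaussian))   # both sides coincide
    print("t * Gauss:", heisenberg_sides(odd_bump, d_odd_bump))   # strict inequality
```

For the Gaussian both sides agree, in line with the equality case of the theorem, whereas for $t\, e^{-t^2/2}$ the left-hand side exceeds the bound by a factor of nine.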

For the proof of Theorem 1.77, one requires the concept of Schwartz space. The
Schwartz space $\mathcal{S}$ consists of all functions $f \in C^\infty(\mathbb{R})$ that satisfy the condition

\[ \sup_{|\alpha| \le N}\, \sup_{t \in \mathbb{R}}\, (1 + t^2)^N\, |D^\alpha f(t)| < \infty \tag{1.20} \]

for every $N \in \mathbb{N}_0$.

Fig. 1.8: Gabor transform for the functions in Fig. 1.7 for different values of the window
parameter $s$ (horizontal axis: $b$; one panel is labeled $s = 0.5$). For small $s$, e.g., $s = 0.01$, the window is well localized in the time domain,
and the edges of the function $f$ appear nicely (horizontal axis), whereas the Fourier spectrum
(vertical axis) is blurred. For large $s$, e.g., $s = 0.5$, on the contrary, the sine character of $\mathcal{F}(f)$ is
well recovered, but the time-domain representation of $f$ is blurred.

In other words, $p \cdot D^\alpha f$ is a bounded function on $\mathbb{R}$ for every polynomial $p$
and every differential operator $D^\alpha$ of order $\alpha \in \mathbb{N}_0$. Since this is also true for
$(1 + t^2)^N p(t)$ instead of $p(t)$, it follows that every $p \cdot D^\alpha f \in L^1(\mathbb{R})$. For,

\[ |p(t) \cdot D^\alpha f(t)| \le \text{const.} \cdot \frac{1}{(1 + t^2)^N}. \]

Therefore, functions in $\mathcal{S}$ decay faster than any power $1/|t|^m$ as $|t| \to \infty$.
Functions with this property form a vector space $\mathcal{S}$ for which the countably
many norms (1.20) generate a locally convex topology.

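Theorem 1.78. 1. $\mathcal{S}$ is a Fréchet space, i.e., a complete, locally convex, metrizable
space.

2. Suppose that $p$ is a polynomial, $g \in \mathcal{S}$, and $\alpha \in \mathbb{N}$. Then each of the following
three maps is continuous and linear from $\mathcal{S} \to \mathcal{S}$:

\[ f \mapsto p \cdot f, \qquad f \mapsto g \cdot f, \qquad f \mapsto D^\alpha f. \]

3. The Fourier transform is a continuous linear mapping from $\mathcal{S} \to \mathcal{S}$. In particular,
$\mathcal{F}(f) \in \mathcal{S}$ for $f \in \mathcal{S}$.

4. $\mathcal{S}$ is dense in $L^2(\mathbb{R})$.

Proof. See, for instance, [209]. □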


1.4.4 Discretization: Gabor Frames 

Up to now, the Gabor transform has been an integral transform operator with
continuous parameters and variables $s$, $b$, and $\omega$. This family of parameters yields a
highly redundant system; therefore, the question arises whether there exists a discretized
representation in the form of a series instead of an integral. Various aspects of such
a discretization are considered in Chapters 2, 3, and 5. We refer to these chapters for
the ideas and details. It is, however, worth noticing that a discretization to a Gabor
Riesz basis with fast decaying basis elements in the time domain as well as in the
frequency domain is not possible. This is the famous Balian-Low theorem, which
is stated in Chapter 2 (Theorem 2.27).


1.4.5 Shortcomings of the Windowed Fourier Transform 


It follows from the definition (1.14) of the windowed Fourier transform that at each
point in the time-frequency domain, a window is translated to the time location
and frequency location under consideration. The duration and the bandwidth of the
window, and thus the resolution, do not change. The resolution in the time and
frequency domains depends only on the form of the window and, by the Heisenberg
uncertainty relation, it is not possible to achieve an optimal resolution simultaneously
in the time and frequency domains. For a signal that contains both low- and
high-frequency components, however, it would be desirable to obtain a good frequency
resolution in the low-frequency spectrum, since there small changes are relevant,
whereas in the high-frequency spectrum, a good time resolution is more important,
since a complete oscillation requires less time, causing a faster change of the instantaneous
frequency. Thus, at low frequencies one needs a good frequency resolution,
accepting a bad time resolution, whereas at high frequencies a good time
resolution but a bad frequency resolution is desired. The STFT does not provide a
means of achieving this.

A generalization of the STFT is provided by the wavelet transform. Instead of 
comparing a signal with a window function that is translated and modulated, the 
wavelet transform compares a signal with a window function that is translated and 
scaled. The scaling induces, analogous to modulation, a frequency shift. However, 
a frequency increase causes a simultaneous reduction of the time duration. At high 
frequencies this results in a better time resolution and at low frequencies in a better 
frequency resolution but a worse time resolution. 

1.5 The Wavelet Transform 

The idea of the wavelet transform consists of comparing the analyzed signal or
image with one single pattern, the wavelet. The pattern is dilated and translated, such
that its shape works like a magnifying glass, which is moved (translated) over a signal
or an image at various distances (dilations) from the signal or image. The wavelet is
a function that is well localized in time or space as well as in the frequency domain.
Therefore, in contrast to the Fourier transform, it allows for a local analysis.

1.5.1 Definition and Properties 

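Definition 1.79. A function $\psi \in L^2(\mathbb{R})$ is called a wavelet if it fulfills the admissibility
condition:

\[ 0 < c_\psi := \int_{\mathbb{R}} \frac{|\mathcal{F}(\psi)(\omega)|^2}{|\omega|}\, d\omega < \infty. \]

The wavelet transform of a function $f \in L^2(\mathbb{R})$ with respect to the wavelet $\psi$ is
defined as

\[ W_\psi f(a,b) := \frac{1}{\sqrt{c_\psi}}\, \frac{1}{\sqrt{|a|}} \int_{\mathbb{R}} f(t)\, \overline{\psi\left(\frac{t - b}{a}\right)}\, dt \]

for all $a \in \mathbb{R} \setminus \{0\}$ and all $b \in \mathbb{R}$.

Let $\psi \in L^1(\mathbb{R})$ be a wavelet. Then its Plancherel transform coincides with its Fourier transform $\mathcal{F}(\psi)$, and $\mathcal{F}(\psi)$ is continuous.
The admissibility condition therefore implies $\mathcal{F}(\psi)(0) = 0$, which is equivalent to
$\psi$ having zero mean:

\[ \int_{\mathbb{R}} \psi(t)\, dt = 0. \]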

This is a necessary condition for wavelets in $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$.

Example 1.80. 1. The generator of the Haar system from Section 1.1.2.3 is a
wavelet, the so-called Haar wavelet:

\[ \psi(t) = \begin{cases} 1, & \text{for } 0 \le t < \tfrac{1}{2}, \\ -1, & \text{for } \tfrac{1}{2} \le t < 1, \\ 0, & \text{else.} \end{cases} \]

For the Fourier transform, $\mathcal{F}(\psi)(\omega) = i e^{-i\omega/2} \sin(\omega/4)\, \dfrac{\sin(\omega/4)}{\omega/4}$. The admissibility
constant is $c_\psi = 2 \ln 2$.

2. A $C^\infty$-example for a wavelet is the Mexican hat wavelet:

\[ \psi(x) = -\frac{d^2}{dx^2}\, e^{-x^2/2} = (1 - x^2)\, e^{-x^2/2}. \]

The Fourier transform has the form

\[ \mathcal{F}(\psi)(\omega) = \sqrt{2\pi}\, \omega^2 e^{-\omega^2/2}; \]

the admissibility constant is $c_\psi = 2\pi$.
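The value $2 \ln 2$ for the Haar wavelet can be reproduced by quadrature. This is a minimal sketch, assuming that the admissibility constant is $c_\psi = \int_{\mathbb{R}} |\mathcal{F}(\psi)(\omega)|^2 / |\omega|\, d\omega$ with the unnormalized Fourier transform, so that $|\mathcal{F}(\psi)(\omega)|^2 = \big(\sin^2(\omega/4)/(\omega/4)\big)^2$ for the Haar wavelet. The cut-off frequency and step size are arbitrary choices that limit the accuracy to a few digits.

```python
import math


def haar_hat_abs_sq(w):
    """|F(psi)(w)|^2 for the Haar wavelet: (sin^2(w / 4) / (w / 4))^2."""
    if w == 0.0:
        return 0.0
    return (math.sin(w / 4.0) ** 2 / (w / 4.0)) ** 2


def admissibility_constant(hat_abs_sq, upper=2000.0, n=200000):
    """Midpoint rule for 2 * int_0^upper |F(psi)(w)|^2 / w dw (the integrand is even)."""
    h = upper / n
    total = 0.0
    for k in range(n):
        w = (k + 0.5) * h
        total += hat_abs_sq(w) / w
    return 2.0 * h * total


if __name__ == "__main__":
    print("numerical c_psi:", admissibility_constant(haar_hat_abs_sq))
    print("2 ln 2         :", 2.0 * math.log(2.0))
```

The slow $|\omega|^{-3}$ decay of the integrand is a reminder that the Haar wavelet is poorly localized in frequency, which is why a large cut-off is needed.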

Lemma 1.81. The set of wavelets $\{\psi \in L^2(\mathbb{R}) \mid \psi \text{ is admissible}\}$ is dense in $L^2(\mathbb{R})$.
Proof. Exercise! □

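The wavelet transform operates as a linear time-invariant filter. To see this, we
consider the Plancherel transform with respect to the second variable:

\begin{align*}
\mathcal{F}(W_\psi f(a, \cdot))(\omega) &= \sqrt{|a|}\, \frac{1}{\sqrt{c_\psi}}\, \mathcal{F}(\overline{\psi})(-a\omega)\, \mathcal{F}(f)(\omega) \\
&= \sqrt{|a|}\, \frac{1}{\sqrt{c_\psi}}\, \overline{\mathcal{F}(\psi)(a\omega)}\, \mathcal{F}(f)(\omega) \quad \text{in } L^2(\mathbb{R}).
\end{align*}

From this we can deduce that the wavelet transform is an isometry: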

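Theorem 1.82. The wavelet transform corresponding to the wavelet $\psi$ is an
isometry:

\[ W_\psi : L^2(\mathbb{R}) \to L^2\left(\mathbb{R}^2, \frac{da\, db}{a^2}\right). \]

Proof. By definition of the wavelet transform, for $f \in L^2(\mathbb{R})$,

\[ W_\psi f(a,b) = \frac{1}{\sqrt{c_\psi}}\, \frac{1}{\sqrt{|a|}} \int_{\mathbb{R}} f(t)\, \overline{\psi\left(\frac{t - b}{a}\right)}\, dt. \]

$W_\psi f(a,b)$ is well defined, because $\psi \in L^2(\mathbb{R})$ and therefore also $\psi((\cdot - b)/a) \in L^2(\mathbb{R})$. By the Parseval equation,

\begin{align*}
\int_{\mathbb{R}} \int_{\mathbb{R}} |W_\psi f(a,b)|^2\, \frac{da\, db}{a^2}
&= \frac{1}{2\pi} \int_{\mathbb{R}} \int_{\mathbb{R}} |\mathcal{F}(W_\psi f(a,\cdot))(\omega)|^2\, \frac{da\, d\omega}{a^2} \\
&= \frac{1}{2\pi c_\psi} \int_{\mathbb{R}} \int_{\mathbb{R}} |a|\, |\mathcal{F}(\overline{\psi})(-a\omega)|^2\, |\mathcal{F}(f)(\omega)|^2\, \frac{da\, d\omega}{a^2} \\
&= \frac{1}{2\pi c_\psi} \int_{\mathbb{R}} \int_{\mathbb{R}} \frac{|\mathcal{F}(\psi)(s)|^2}{|s|}\, |\mathcal{F}(f)(\omega)|^2\, ds\, d\omega \\
&= \frac{1}{2\pi}\, \|\mathcal{F}(f)\|^2_{L^2(\mathbb{R})} = \|f\|^2_{L^2(\mathbb{R})}.
\end{align*}
□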

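Theorem 1.83 (Inverse wavelet transform). The adjoint operator

\[ W_\psi^* : L^2\left(\mathbb{R}^2, \frac{da\, db}{a^2}\right) \to L^2(\mathbb{R}), \qquad g \mapsto \frac{1}{\sqrt{c_\psi}} \int_{\mathbb{R}} \int_{\mathbb{R}} \frac{1}{\sqrt{|a|}}\, \psi\left(\frac{\cdot - b}{a}\right) g(a,b)\, \frac{da\, db}{a^2} \tag{1.21} \]

is the inverse operator of the wavelet transform on the image $W_\psi(L^2(\mathbb{R}))$.

In fact, for all $f \in L^2(\mathbb{R})$ and all $g \in L^2\left(\mathbb{R}^2, \frac{da\, db}{a^2}\right)$,

\begin{align*}
\langle W_\psi f, g \rangle_{L^2(\mathbb{R}^2, \frac{da\, db}{a^2})}
&= \int_{\mathbb{R}} \int_{\mathbb{R}} W_\psi f(a,b)\, \overline{g(a,b)}\, \frac{da\, db}{a^2} \\
&= \int_{\mathbb{R}} \int_{\mathbb{R}} \frac{1}{\sqrt{c_\psi}} \int_{\mathbb{R}} f(t)\, \frac{1}{\sqrt{|a|}}\, \overline{\psi\left(\frac{t - b}{a}\right)}\, \overline{g(a,b)}\, dt\, \frac{da\, db}{a^2} \\
&= \int_{\mathbb{R}} f(t)\, \overline{\frac{1}{\sqrt{c_\psi}} \int_{\mathbb{R}} \int_{\mathbb{R}} \frac{1}{\sqrt{|a|}}\, \psi\left(\frac{t - b}{a}\right) g(a,b)\, \frac{da\, db}{a^2}}\, dt \\
&= \langle f, W_\psi^* g \rangle.
\end{align*}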

The parameter $b$ shifts the pattern, the wavelet $\psi$, over the signal. The parameter
$a$ scales the wavelet and therefore adapts its shape. If the scaled wavelet at a certain
place $b$ has a similar local shape as the analyzed signal, then the wavelet coefficient
has a large absolute value. Small $|a|$ describe small details, while larger $|a|$ generate
approximations of the analyzed function in a neighborhood of the point $b$. In fact,
the wavelet transform operates as a filter:

\[ W_\psi f(a,b) = \frac{1}{\sqrt{c_\psi}}\, \frac{1}{\sqrt{|a|}} \int_{\mathbb{R}} f(t)\, \overline{\psi\left(\frac{t - b}{a}\right)}\, dt = \frac{1}{\sqrt{c_\psi}}\, (D_{-a}\overline{\psi} * f)(b), \]

where $D_{-a}\overline{\psi}(u) := |a|^{-1/2}\, \overline{\psi(-u/a)}$.
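How the wavelet coefficients react to local structure can be seen in a case that is computable in closed form. The sketch below is an illustration under stated assumptions: the signal is the indicator function of $[0,1)$, the wavelet is the Haar wavelet with $c_\psi = 2\ln 2$, the normalization is $W_\psi f(a,b) = (c_\psi |a|)^{-1/2} \int f(t)\, \psi((t-b)/a)\, dt$, and only scales $a > 0$ are handled. The function names are hypothetical.

```python
import math

C_PSI = 2.0 * math.log(2.0)  # assumed Haar admissibility constant


def cumulative_box(x):
    """Integral of the indicator function of [0, 1) from -infinity to x."""
    return min(max(x, 0.0), 1.0)


def haar_transform_of_box(a, b):
    """W_psi f(a, b) for f = indicator of [0, 1), Haar psi, and a > 0."""
    if a <= 0.0:
        raise ValueError("this sketch only handles scales a > 0")
    # The dilated and shifted Haar wavelet equals +1 on [b, b + a/2) and -1 on [b + a/2, b + a).
    plus = cumulative_box(b + a / 2.0) - cumulative_box(b)
    minus = cumulative_box(b + a) - cumulative_box(b + a / 2.0)
    return (plus - minus) / math.sqrt(C_PSI * a)


if __name__ == "__main__":
    a = 0.1
    for b in (-0.5, -0.05, 0.4, 0.95, 2.0):
        print(b, haar_transform_of_box(a, b))
```

At the small scale $a = 0.1$ the coefficients vanish wherever the signal is locally constant and are nonzero only when the wavelet straddles one of the jumps at $t = 0$ or $t = 1$.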


For fixed $a$, the wavelet transform corresponds to filtering with $\psi(\cdot/a)$ at the point $b$. A wavelet $\psi \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$
is a bandpass filter, because due to the admissibility condition, $\mathcal{F}(\psi)(0) = 0$
and by the Riemann-Lebesgue lemma, $\lim_{|\omega| \to \infty} \mathcal{F}(\psi)(\omega) = 0$.

1.5.2 Scale Discretization: The Dyadic Wavelet Transform

To discretize the wavelet transform $W_\psi$, we will have to discretize both parameters
$a$ and $b$. To this end, we consider the affine operators $D_a$ and $T_b$ for dilations
and translations, respectively, with $D_a \psi := |a|^{-1/2}\, \psi(\cdot/a)$. Obviously, $W_\psi f(a,b) = (1/\sqrt{c_\psi})\, \langle f, T_b D_a \psi \rangle_{L^2(\mathbb{R})}$.
Therefore, the question arises whether there exists a discrete set of wavelets $\{T_b D_a \psi \mid (a,b) \in I\}$,
with $I$ a discrete set, such that the inversion formula still holds; i.e., no information
about the analyzed function $f$ is lost by considering the discrete set of
coefficients $\{W_\psi f(a,b) \mid (a,b) \in I\}$.

The wavelet transform is translation-invariant. Let $\tau \in \mathbb{R}$. Then

\[ W_\psi(T_\tau f)(a,b) = W_\psi(f(\cdot - \tau))(a,b) = W_\psi(f)(a, b - \tau). \]

The idea of this section is to discretize the scale parameter $a$ while keeping the
translation invariance of the wavelet transform.

Definition 1.84. Let $f \in L^2(\mathbb{R})$. Its dyadic wavelet transform is defined as

\[ W_\psi f(2^j, b) = \int_{\mathbb{R}} f(t)\, \frac{1}{\sqrt{2^j}}\, \overline{\psi\left(\frac{t - b}{2^j}\right)}\, dt, \qquad j \in \mathbb{Z}. \]

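Theorem 1.85. Suppose there exist two positive constants $A$ and $B$ such that

\[ A \le \sum_{j \in \mathbb{Z}} |\mathcal{F}(\psi)(2^j \omega)|^2 \le B, \qquad \forall\, \omega \in \mathbb{R}. \]

Then

\[ A\, \|f\|^2 \le \sum_{j \in \mathbb{Z}} \frac{1}{2^j}\, \|W_\psi f(2^j, \cdot)\|^2 \le B\, \|f\|^2 \]

with respect to the $L^2(\mathbb{R})$-norm.

Proof. Consider the Plancherel transform

\[ \mathcal{F}(W_\psi f(2^j, \cdot))(\omega) = \sqrt{2^j}\, \overline{\mathcal{F}(\psi)(2^j \omega)}\, \mathcal{F}(f)(\omega). \]

Summation together with the assumption gives

\[ A\, |\mathcal{F}(f)(\omega)|^2 \le \sum_{j \in \mathbb{Z}} \frac{1}{2^j}\, |\mathcal{F}(W_\psi f(2^j, \cdot))(\omega)|^2 \le B\, |\mathcal{F}(f)(\omega)|^2 \]

for all $\omega \in \mathbb{R}$. Integration, the theorem of dominated convergence, and the
Parseval equation yield the claim. □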
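Whether a given wavelet satisfies the hypothesis of Theorem 1.85 can be explored numerically. The sketch below uses the profile $h(\omega) = \omega^2 e^{-\omega^2/2}$, which is the Fourier transform of the Mexican hat wavelet up to a constant factor (the bounds scale with the square of that factor). Since the dyadic sum is invariant under $\omega \mapsto 2\omega$, it suffices to sample one octave. The truncation of the sum and the sampling density are illustrative assumptions.

```python
import math


def profile_sq(w):
    """|h(w)|^2 for the assumed profile h(w) = w^2 exp(-w^2 / 2)."""
    return (w * w * math.exp(-w * w / 2.0)) ** 2


def dyadic_sum(w, j_range=60):
    """Truncated sum over j of |h(2^j w)|^2."""
    return sum(profile_sq(2.0 ** j * w) for j in range(-j_range, j_range + 1))


def frame_bounds(samples=1000):
    """Minimum and maximum of the dyadic sum over the octave [1, 2)."""
    values = [dyadic_sum(1.0 + k / samples) for k in range(samples)]
    return min(values), max(values)


if __name__ == "__main__":
    lower, upper = frame_bounds()
    print("A ~", lower, " B ~", upper, " B / A ~", upper / lower)
```

The ratio $B/A$ stays close to one for this profile, which indicates that the dyadic scales cover the frequency axis almost uniformly. The point $\omega = 0$, where the sum vanishes, has measure zero and does not affect the norm estimate.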

The theorem shows that the normalized dyadic wavelet transform

\[ f \mapsto \left\{ 2^{-j/2}\, W_\psi f(2^j, \cdot) \right\}_{j \in \mathbb{Z}} \]

has the same properties as a frame.

We have seen above that the discretization of the scale parameter $a$ yields
the dyadic wavelet transform, which behaves similarly to a frame. In this section, we
discretize the translation parameter $b$. Our aim is to obtain a wavelet basis of the
form

\[ \left\{ 2^{-j/2}\, \psi\left(\frac{\cdot - 2^j k}{2^j}\right) \right\}_{j,k \in \mathbb{Z}}. \]

A classical approach to this end is multiresolution analysis.

Definition 1.86. A sequence of nonempty closed subspaces $\{V_j\}_{j \in \mathbb{Z}}$ of $L^2(\mathbb{R})$ is
called a multiresolution analysis if all of the following properties are satisfied:

1. For all $j \in \mathbb{Z}$, $V_{j+1} \subset V_j$; i.e., the spaces are nested.

2. The spaces are translation-invariant: For all $(j,k) \in \mathbb{Z}^2$,

\[ f \in V_j \iff f(\cdot - 2^j k) \in V_j. \]

3. Scaling or refinement relation: For all $j \in \mathbb{Z}$,

\[ f \in V_j \iff f(\cdot/2) \in V_{j+1}. \]

4. The subspaces span $L^2(\mathbb{R})$ and separate the space:

\[ \lim_{j \to -\infty} V_j = \operatorname{cl}\Big( \bigcup_{j \in \mathbb{Z}} V_j \Big) = L^2(\mathbb{R}), \qquad \lim_{j \to \infty} V_j = \bigcap_{j \in \mathbb{Z}} V_j = \{0\}. \]

5. There is a so-called scaling function $\varphi \in L^2(\mathbb{R})$, such that the family
$\{\varphi(\cdot - n)\}_{n \in \mathbb{Z}}$ forms a Riesz basis of $V_0$.

Because of the scaling relation, multiresolution analyses of this form with
dilation 2 are sometimes called dyadic multiresolution analyses. Since the spaces
$\{V_j\}_{j \in \mathbb{Z}}$ are nested, the approximation at scale $2^{-j}$ contains all information of the
coarser scale $2^{-j-1}$.

There is a practical criterion to check whether a function fulfills the Riesz
property 5.

Proposition 1.87. The following are equivalent:

1. The family $\{\varphi(\cdot - n)\}_{n \in \mathbb{Z}}$ is a Riesz basis of $V_0 = \operatorname{cl}\{\operatorname{span}\{\varphi(\cdot - n)\}_{n \in \mathbb{Z}}\}$.

2. There are constants $A, B > 0$ such that for all $\omega \in [-\pi, \pi]$,

\[ A \le \sum_{k \in \mathbb{Z}} |\mathcal{F}(\varphi)(\omega - 2k\pi)|^2 \le B. \]

Proof. See, e.g., [244, Prop. 2.8]. □

Example 1.88. 1. The piecewise-constant functions generate multiresolution
analyses via the nested spaces

\[ V_j = \{ g \in L^2(\mathbb{R}) \mid g|_{[2^j n,\, 2^j (n+1))} = \text{const. a.e.},\ n \in \mathbb{Z} \}. \]

The scaling function is $\varphi = \chi_{[0,1)}$.

2. In an analogous manner, the piecewise polynomial functions generate
multiresolution analyses [244].
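The Riesz criterion of Proposition 1.87 can be evaluated numerically for simple generators. The sketch below considers two illustrative choices: the indicator function $\chi_{[0,1)}$, whose Fourier transform has modulus $|\sin(\omega/2)/(\omega/2)|$, and, as an assumed piecewise linear example, the hat function $\chi_{[0,1)} * \chi_{[0,1)}$, whose transform has the squared modulus. The truncation of the sum over $k$ is an arbitrary choice.

```python
import math


def sinc_half(w):
    """sin(w / 2) / (w / 2): modulus (up to sign) of the transform of the indicator of [0, 1)."""
    return 1.0 if w == 0.0 else math.sin(w / 2.0) / (w / 2.0)


def periodized_energy(w, power, terms=5000):
    """Truncated sum over k of |sinc_half(w - 2 k pi)|^(2 * power).

    power = 1 corresponds to the indicator function, power = 2 to the hat function.
    """
    return sum(
        sinc_half(w - 2.0 * math.pi * k) ** (2 * power)
        for k in range(-terms, terms + 1)
    )


def riesz_bounds(power, samples=9):
    """Minimum and maximum of the periodized energy on a grid over [-pi, pi]."""
    grid = [-math.pi + 2.0 * math.pi * k / (samples - 1) for k in range(samples)]
    values = [periodized_energy(w, power) for w in grid]
    return min(values), max(values)


if __name__ == "__main__":
    print("indicator function:", riesz_bounds(1))  # close to (1, 1): orthonormal translates
    print("hat function      :", riesz_bounds(2))  # close to (1/3, 1): Riesz basis, not orthonormal
```

For the indicator function the sum is identically one, reflecting the orthonormality of its integer translates. For the hat function the sum equals $(2 + \cos\omega)/3$, so the translates form a Riesz basis with bounds $1/3$ and $1$.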

Now the question arises of how the wavelets come into play. In fact, they form
a Riesz basis for the complementary spaces $W_{j+1}$ of $V_{j+1}$ in $V_j$:

\[ V_{j+1} \oplus W_{j+1} = V_j, \qquad j \in \mathbb{Z}. \]

Properties 1 and 4 of Definition 1.86 give the decomposition

\[ L^2(\mathbb{R}) = \bigoplus_{j \in \mathbb{Z}} W_j. \]

A wavelet can be generated from a scaling function in various ways. A common
approach is to consider

\[ \psi(t) = \sum_{n \in \mathbb{Z}} (-1)^n\, \overline{a_n}\, \varphi(2t + n + 1), \]

where $a_n = \int_{\mathbb{R}} \varphi(t/2)\, \overline{\varphi(t - n)}\, dt$. Thus, the wavelet can be calculated directly from
the scaling function.

The ideas behind this construction, as well as many other possibilities, are
discussed, e.g., in [60, 170, 175, 244] and many other books on wavelets.

There are many families of scaling function/wavelet pairs; B-splines, for example, are
among them. The Haar wavelet together with its scaling function $\varphi = \chi_{[0,1)}$ belongs
to this class. A good introduction to spline wavelets and others and their approximation
properties can be found in [226, 244].

Remark 1.89. There exist wavelet bases of $L^2(\mathbb{R})$ that are not associated with any
multiresolution analysis. Examples are the so-called unimodular wavelets, whose
Fourier transforms are characteristic functions of certain sets. Such sets are called
wavelet sets. More about unimodular wavelets and wavelet sets can be found in the
list of references in [244, Sect. 3.4]. For interesting connections between composite
wavelets as discussed in Chapter 3, wavelet sets, and reflection groups, we refer the
reader to [157-159].

Mallat made the important discovery that there is a fast algorithm for the
wavelet transform. The pyramidal algorithm based on convolutions that he proposed
in his article [174] was the breakthrough for the applicability of wavelet transform
theory.

In this section, we briefly consider the problem of extending wavelets to higher
dimensions. Extensions via the tensor product are always possible, but from a
modeling viewpoint are not very desirable; images and data would have to be
assumed to be homogeneous and to possess a lattice structure. To overcome these
impediments, several new types of wavelets have been introduced, some with very
specific applications in mind.

After introducing the tensor product approach for the construction of wavelet
bases in $\mathbb{R}^2$, a simple case that nevertheless reflects the main ideas, we list some
examples of multiscale transforms and give references to the literature.


1.6.1 Tensor Product Wavelets in 2D 


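To employ wavelets for image analyses in 2D and 3D, one can consider the tensor
product of 1D wavelets. To this end, suppose that $\varphi$ is a given (one-dimensional)
scaling function for an MRA on $L^2(\mathbb{R})$ and $\psi$ the associated wavelet. Define

\[ \Phi_{j_1 k_1; j_2 k_2}(x, y) := (\varphi_{j_1 k_1} \otimes \varphi_{j_2 k_2})(x, y) = \varphi_{j_1 k_1}(x)\, \varphi_{j_2 k_2}(y), \]

where $\varphi_{jk} := 2^{j/2} \varphi(2^j \cdot - k)$, $j, k \in \mathbb{Z}$. It can be shown that the scaling functions
$\{\Phi_{j k_1; j k_2} \mid j \in \mathbb{Z};\ (k_1, k_2) \in \mathbb{Z} \times \mathbb{Z}\}$ are basis functions for approximation spaces
$\mathfrak{V}_j \subset L^2(\mathbb{R}^2)$ by setting

\[ \mathfrak{V}_j := V_j \otimes V_j, \qquad j \in \mathbb{Z}, \]

where $\otimes$ denotes the tensor product of the vector spaces $V_j$. It should be clear that if
the spaces $V_j$ form an MRA of $L^2(\mathbb{R})$, then the spaces $\mathfrak{V}_j$ form an MRA of $L^2(\mathbb{R}^2)$.
Now, with the spaces indexed such that $V_{j+1} = V_j \oplus W_j$,

\begin{align*}
\mathfrak{V}_{j+1} = V_{j+1} \otimes V_{j+1} &= (V_j \oplus W_j) \otimes (V_j \oplus W_j) \\
&= (V_j \otimes V_j) \oplus \big( (V_j \otimes W_j) \oplus (W_j \otimes V_j) \oplus (W_j \otimes W_j) \big) \\
&= \mathfrak{V}_j \oplus \mathfrak{W}_j,
\end{align*}

where we set

\[ \mathfrak{W}_j := \underbrace{(V_j \otimes W_j)}_{\text{horizontal}} \oplus \underbrace{(W_j \otimes V_j)}_{\text{vertical}} \oplus \underbrace{(W_j \otimes W_j)}_{\text{diagonal}}. \]

Hence, in the two-dimensional setting, there are three wavelets:

\begin{align*}
\psi^h(x, y) &= (\varphi \otimes \psi)(x, y) = \varphi(x)\, \psi(y), && \text{(horizontal)} \\
\psi^v(x, y) &= (\psi \otimes \varphi)(x, y) = \psi(x)\, \varphi(y), && \text{(vertical)} \\
\psi^d(x, y) &= (\psi \otimes \psi)(x, y) = \psi(x)\, \psi(y), && \text{(diagonal)}
\end{align*}

if $x$ is taken along the horizontal and $y$ along the vertical direction. Moreover,

\[ \{ \psi^\lambda_{j k_1; j k_2} \mid (k_1, k_2) \in \mathbb{Z}^2,\ \lambda = h, v, \text{ or } d \} \]

is a Riesz basis for $\mathfrak{W}_j$, and every function $f \in L^2(\mathbb{R}^2)$ can be written
as

\[ f(x, y) = \sum_{j \in \mathbb{Z}} \sum_{(k_1, k_2) \in \mathbb{Z}^2} \langle c^j_{k_1 k_2}, \Psi_{j k_1; j k_2}(x, y) \rangle, \]

where $\Psi_{j k_1; j k_2} = (\psi^h_{j k_1; j k_2}, \psi^v_{j k_1; j k_2}, \psi^d_{j k_1; j k_2})^T$ and the $c^j_{k_1 k_2} = c^j_{k_1 k_2}(f)$ are vector
coefficients.

A similar construction in 3D yields seven wavelets oriented along the edges and the face and
space diagonals of the unit 3D cube.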
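One level of the tensor product decomposition can be written down explicitly for the Haar system. The following sketch is a discrete illustration under stated assumptions: the image is a small array `image[y][x]` with even side lengths, the one-dimensional step uses the orthonormal Haar averages and differences, and the four outputs are named after the pattern above (smooth in $x$ and wavelet in $y$ is "horizontal", and so on). The function names are hypothetical.

```python
import math

S = 1.0 / math.sqrt(2.0)


def haar_step(values):
    """One orthonormal Haar step: pairwise averages (low) and differences (high)."""
    half = len(values) // 2
    low = [(values[2 * k] + values[2 * k + 1]) * S for k in range(half)]
    high = [(values[2 * k] - values[2 * k + 1]) * S for k in range(half)]
    return low, high


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def split_rows(matrix):
    """Apply haar_step to every row and collect the low and high parts."""
    pairs = [haar_step(row) for row in matrix]
    return [p[0] for p in pairs], [p[1] for p in pairs]


def haar2d(image):
    """Return (approximation, horizontal, vertical, diagonal) for image[y][x]."""
    low_x, high_x = split_rows(image)              # filter along x
    a_t, h_t = split_rows(transpose(low_x))        # scaling function in x, then filter along y
    v_t, d_t = split_rows(transpose(high_x))       # wavelet in x, then filter along y
    return transpose(a_t), transpose(h_t), transpose(v_t), transpose(d_t)


if __name__ == "__main__":
    # A 4 x 4 image with a single horizontal edge between the first and second row.
    image = [[0.0] * 4] + [[1.0] * 4 for _ in range(3)]
    for name, part in zip(("approx", "horizontal", "vertical", "diagonal"), haar2d(image)):
        print(name, part)
```

For this image only the horizontal detail coefficients are nonzero, and the total energy of the four parts equals the energy of the image, as expected from an orthonormal decomposition.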


Remark 1.90. To reduce the number of wavelets, dilation matrices $A$ other than the
dyadic $2I$ can also be used. The number of wavelets depends on the
number of cosets, which is specified by the determinant of $A$. For example, if in 2D
$A = 2I$, then $|\det A| = |\det 2I| = 4$ and there are $|\det A| - 1 = 3$ wavelets, as we saw
above. The matrix

\[ \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \]

generates the so-called quincunx grid in 2D, and every
multiresolution analysis based on this dilation matrix needs only one wavelet
to generate the orthogonal complements $W_j$. For more interesting facts on the
quincunx lattice, its dilation matrices in 2D, and extensions to higher dimensions,
we refer to [153, 170, 227] and the references therein.
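The count $|\det A| - 1$ can be made tangible by enumerating the cosets of $\mathbb{Z}^2 / A\mathbb{Z}^2$. The sketch below is an illustration only: two integer points $k, k'$ lie in the same coset exactly when $\operatorname{adj}(A)(k - k')$ is divisible by $\det A$ in both components, so the residues of $\operatorname{adj}(A)\, k$ modulo $|\det A|$ serve as coset labels. The function names and the size of the search box are assumptions.

```python
def det2(a):
    """Determinant of a 2 x 2 integer matrix given as nested lists."""
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def coset_count(a, box=6):
    """Number of cosets of Z^2 / A Z^2 met by the integer points of a square box."""
    d = abs(det2(a))
    adj = [[a[1][1], -a[0][1]], [-a[1][0], a[0][0]]]  # adjugate of A
    labels = set()
    for k1 in range(-box, box + 1):
        for k2 in range(-box, box + 1):
            labels.add((
                (adj[0][0] * k1 + adj[0][1] * k2) % d,
                (adj[1][0] * k1 + adj[1][1] * k2) % d,
            ))
    return len(labels)


def wavelet_count(a):
    """Number of wavelets needed for the dilation matrix A: |det A| - 1."""
    return abs(det2(a)) - 1


if __name__ == "__main__":
    dyadic = [[2, 0], [0, 2]]
    quincunx = [[1, 1], [-1, 1]]
    for name, matrix in (("dyadic 2I", dyadic), ("quincunx", quincunx)):
        print(name, "cosets:", coset_count(matrix), "wavelets:", wavelet_count(matrix))
```

For the dyadic matrix this yields four cosets and three wavelets, for the quincunx matrix two cosets and a single wavelet.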


1.6.2 Some Wavelet-Type Transforms 

Although 1D wavelets have good resolution in both the time and frequency domains,
they lack the ability to resolve signals along arbitrary directions in 2D and 3D. In
addition, a large number of wavelet coefficients are required to account for edges,
i.e., for singularities along lines or curves. In order to retain the multiscale structure
of a signal decomposition, the idea of the wavelet transform had to be extended to
incorporate the resolution of singularities along lines or curves.

Among these extensions are the following transforms, which are briefly 
described for image decompositions and analysis. 

• Ridgelet transform: The idea is to choose basis functions that are constant along 
lines, i.e., ridges, and that transverse to these ridges are wavelets in the regular 
sense. Details of this construction can be found in, for instance, [39, 67]. 

• Curvelet transform: Here an image is analyzed using different block sizes but
employing only a single transform. The image is first decomposed into a set
of wavelet bands and then each band is analyzed using a ridgelet transform.
The block size can be changed at each scale level. As references, we mention
[40-42].



1 Introduction: Mathematical Aspects of Time-Frequency Analysis 




• Beamlets: Unlike wavelets, which offer a localized scale and position
representation near fixed regions in an image with a specified scale and location,
beamlets offer a localized scale, position, and orientation based on dyadically
organized line segments. For more details, the reader is referred to [71].

• Wedgelets: Wedgelet approximations were introduced in [70] as a means to
efficiently approximate piecewise-constant images. Generally speaking, wedgelet
approximations are obtained by adaptively partitioning the image domain into
disjoint sets and by computing an approximation of the image on each of these
sets. Optimal approximations are defined using a certain functional that weighs
the approximation error against the complexity of the decomposition. As a reference
for an application, see [99].

• Platelets: The image partition is based on recursive, dyadic squares allowing
wedge-shaped final nodes (instead of squares). Like wedgelets, platelets approximate
with piecewise-constant functions. They are suited for the approximation
of images consisting of smooth regions separated by smooth contours. For more
details and an application, see [243].

• Framelets: Here, the idea of redundant representations and frames is employed
to construct redundant wavelet systems. The interested reader is referred to [61]
for a construction and more details.

• Shearlets: Shearlets are an affine system with a single generator parameterized 
by scaling, shear, and translation parameters. The shear parameter captures the 
direction of singularities, and the shearlet transform can be regarded as matrix 
coefficients of a unitary representation of a special affine group. In addition, there 
exists a natural MRA structure associated with the systems. For the construction 
and a discussion of shearlets, we refer the reader to Chapter 3. 


1.6.3 Moving to Other Manifolds—Wavelets on the Sphere 

For applications that deal with one- or two-dimensional signals of finite duration 
or with data that are distributed on spherical surfaces and that require a multiscale 
approach, the notion of a wavelet has to be extended to encompass the underlying 
geometry of these applications. 

One way of considering wavelets on compact intervals is via periodization, which
corresponds to constructing wavelets on the circle $\mathbb{S}^1$. (See, for instance, [60].)
There exist several methods of constructing wavelets on compact intervals. One such 
method adds boundary functions to the collection of wavelets whose support lies in 
the interior of the interval in order to preserve the orthogonality conditions [53]. 
Another approach is via multiwavelets based on fractal functions, where such 
boundary functions are not necessary. This latter approach was also extended to 
higher-dimensional settings. The interested reader may consult [72, 73, 103, 132, 
181, 182] as references. 

Extending wavelets to the sphere $\mathbb{S}^2$ is not trivial. One reason is the nonexistence
of a homogeneous dilation operator on $\mathbb{S}^2$. In addition, notions such as the
Fourier transform that were introduced earlier in this chapter need to be transferred
to $\mathbb{S}^2$. The mathematical details of harmonic analysis on the sphere and also on other
manifolds can be found in [93] and [136].

In Chapter 4, a construction of wavelets on $\mathbb{S}^2$ is presented and applications to
astrophysics and neuroscience are considered.

1. Prove Theorem 1.8.

2. Prove Theorem 1.16. 

3. Verify the claim made in Example 1.21. 

4. Prove Theorem 1.26. 

5. Prove Theorem 1.28. 

6. Show that $(L^1(\mathbb{T}), *)$ is a commutative Banach algebra without unity and that
the Fourier transform is a Banach algebra homomorphism from $L^1(\mathbb{T}) \to \ell^\infty(\mathbb{Z})$.

7. Verify the claim made in Remark 1.33.

8. Prove Theorem 1.34.

9. Complete the proof of Corollary 1.50.

10. Prove Theorem 1.57.

11. Assume that $f \in L^2(\mathbb{R})$ and $g \in L^1(\mathbb{R})$. Show that $\mathcal{F}(g * f) = \mathcal{F}(g)\, \mathcal{F}(f)$, where the transforms of the $L^2$-functions $g * f$ and $f$ are understood as Plancherel transforms.

12. Prove Theorem 1.62.

13. Let $g_s$ be given as in (1.15). Show that $\Delta_{g_s} = \sqrt{s}$.

14. Prove Theorem 1.72.

15. Prove Theorem 1.74.

16. Prove Lemma 1.81. Hint: For $f \in L^2(\mathbb{R})$ and $\varepsilon > 0$, consider $\mathcal{F}(f) \cdot \chi_{\mathbb{R} \setminus (-\varepsilon, \varepsilon)}$.

17. Show that the integral in (1.21) defines an element of $L^2(\mathbb{R})$.

18. Verify the following equations for the dilation and translation operators
introduced in Section 1.5.2, $D_a$ and $T_b$, respectively.

a. For the adjoint operators, $(D_a)^* = D_{1/a}$ and $(T_b)^* = T_{-b}$.

b. $W_\psi(T_B D_A f)(a,b) = W_{T_b D_a \psi} f(1/A, -B/A) = W_\psi f(a/A, (b - B)/A)$.

19. Verify Example 1.88. 

20. Show that the nested spaces

\[ V_j = \{ g \in L^2(\mathbb{R}) \mid \operatorname{supp} \mathcal{F}(g) \subset [-2^{-j}\pi,\, 2^{-j}\pi] \}, \qquad j \in \mathbb{Z}, \]

form a multiresolution analysis of $L^2(\mathbb{R})$.

Hint: Use $\varphi(t) = (\sin \pi t)/(\pi t)$ as the scaling function and apply the Shannon
sampling theorem.

Abstract. B-splines are some of the most versatile functions in applied mathematics.
The purpose of this chapter is to present the theory of frames in Hilbert spaces with
a direct focus on B-spline generators.


2.1 Introduction 


Frames provide a natural way of expanding functions in separable Hilbert spaces: 
They are more general than orthonormal bases and yield more flexibility. In this 
chapter we give a short presentation of general frame theory, as well as an 
introduction to frames in $L^2(\mathbb{R})$ having Gabor structure or wavelet structure. The
main body of the chapter concerns explicit frame constructions based on
B-splines.

The content can naturally be split into two parts: an introduction to frames in
general Hilbert spaces, and concrete constructions in $L^2(\mathbb{R})$. The two parts are tied
together by Section 2.7, where the B-splines are introduced. 

We begin in Section 2.2 by considering the so-called Bessel condition: It is 
a technical condition implying that all the series expansions considered in this 
chapter converge unconditionally. Section 2.3 reminds the reader about bases, in particular,
orthonormal bases, in Hilbert spaces; the important case of a Riesz basis is
discussed in Section 2.4. Section 2.5 introduces frames and their central properties. 
Section 2.6 relates frames and Riesz bases; in particular, it turns out that all Riesz 
bases are frames. 

Section 2.7 marks the beginning of the second part of the chapter, where we
focus on concrete constructions. Most of these constructions are based on B-splines,
so Section 2.7 gives a short presentation on their key properties. Section 2.8
deals with the basic properties of systems of functions formed by translates of a
single function. Section 2.9 introduces Gabor systems and their frame properties, 
and Section 2.10 focuses on the case of tight frames. Section 2.11 states the main 
results from the theory for dual frames associated with a given Gabor frame; in 
Section 2.12 these results are used to construct explicitly given dual pairs of Gabor 
frames. Finally, Section 2.13 deals with wavelet frames generated by B-splines, in 
particular, the constructions obtained via the unitary extension principle due to Ron 
and Shen. 


2.2 Bessel Sequences in Hilbert Spaces 

The ultimate goal of the present chapter is to obtain series expansions in infinite-dimensional
vector spaces. The purpose of the current section is to introduce a
condition that ensures that the relevant infinite series actually converge.

Let $\mathcal{H}$ be a separable Hilbert space, with the inner product $\langle \cdot, \cdot \rangle$ chosen to be
linear in the first entry. When speaking about a sequence $\{f_k\}_{k=1}^\infty$ in $\mathcal{H}$, we mean
an ordered set, i.e.,

$$\{f_k\}_{k=1}^\infty = \{f_1, f_2, \dots\}.$$

That we have chosen to index the sequence by the natural numbers is just for 
convenience: Soon, we will see that all results in this section (and all subsequent 
results based on the Bessel condition) hold with arbitrary countable index sets and
the elements $f_k$ ordered in an arbitrary way.

We begin with a technical lemma. 

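Lemma 2.1. Let $\{f_k\}_{k=1}^\infty$ be a sequence in $\mathcal{H}$, and suppose that $\sum_{k=1}^\infty c_k f_k$ is
convergent for all $\{c_k\}_{k=1}^\infty \in \ell^2(\mathbb{N})$. Then

$$T : \ell^2(\mathbb{N}) \to \mathcal{H}, \quad T\{c_k\}_{k=1}^\infty := \sum_{k=1}^\infty c_k f_k \tag{2.1}$$

defines a bounded linear operator. The adjoint operator is given by

$$T^* : \mathcal{H} \to \ell^2(\mathbb{N}), \quad T^* f = \{\langle f, f_k \rangle\}_{k=1}^\infty. \tag{2.2}$$

Furthermore,

$$\sum_{k=1}^\infty |\langle f, f_k \rangle|^2 \le \|T\|^2 \, \|f\|^2, \quad \forall f \in \mathcal{H}. \tag{2.3}$$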

Proof. Consider the sequence of bounded linear operators

$$T_n : \ell^2(\mathbb{N}) \to \mathcal{H}, \quad T_n\{c_k\}_{k=1}^\infty := \sum_{k=1}^n c_k f_k.$$

Clearly, $T_n \to T$ pointwise as $n \to \infty$, so by the principle of uniform boundedness,
the map $T$ defines a bounded linear operator. In order to find the expression for $T^*$,
let $f \in \mathcal{H}$ and $\{c_k\}_{k=1}^\infty \in \ell^2(\mathbb{N})$. Then

$$\langle f, T\{c_k\}_{k=1}^\infty \rangle_{\mathcal{H}} = \Big\langle f, \sum_{k=1}^\infty c_k f_k \Big\rangle_{\mathcal{H}} = \sum_{k=1}^\infty \langle f, f_k \rangle \overline{c_k}. \tag{2.4}$$

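When $T : \ell^2(\mathbb{N}) \to \mathcal{H}$ is bounded, we know that $T^*$ is a bounded operator from
$\mathcal{H}$ to $\ell^2(\mathbb{N})$. Therefore, the $k$th-coordinate function is bounded from $\mathcal{H}$ to $\mathbb{C}$; by
Riesz' representation theorem, $T^*$ therefore has the form

$$T^* f = \{\langle f, g_k \rangle\}_{k=1}^\infty$$

for some $\{g_k\}_{k=1}^\infty$ in $\mathcal{H}$. By the definition of $T^*$, (2.4) now shows that

$$\sum_{k=1}^\infty \langle f, g_k \rangle \overline{c_k} = \sum_{k=1}^\infty \langle f, f_k \rangle \overline{c_k}, \quad \forall \{c_k\}_{k=1}^\infty \in \ell^2(\mathbb{N}), \ f \in \mathcal{H}.$$

It follows from here that $g_k = f_k$.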

The adjoint of a bounded operator $T$ is itself bounded, and $\|T\| = \|T^*\|$. Under
the assumption in Lemma 2.1, we therefore have

$$\|T^* f\|^2 \le \|T\|^2 \, \|f\|^2, \quad \forall f \in \mathcal{H},$$

which leads to (2.3). □

Sequences $\{f_k\}_{k=1}^\infty$ for which an inequality of the type (2.3) holds will play a
crucial role in the sequel.

Definition 2.2. A sequence $\{f_k\}_{k=1}^\infty$ in $\mathcal{H}$ is called a Bessel sequence if there exists
a constant $B > 0$ such that

$$\sum_{k=1}^\infty |\langle f, f_k \rangle|^2 \le B \, \|f\|^2, \quad \forall f \in \mathcal{H}. \tag{2.5}$$

Any number $B$ satisfying (2.5) is called a Bessel bound for $\{f_k\}_{k=1}^\infty$. The optimal
bound for a given Bessel sequence $\{f_k\}_{k=1}^\infty$ is the smallest possible value of $B > 0$
satisfying (2.5). Except for the case $f_k = 0$, $\forall k \in \mathbb{N}$, the optimal bound always exists.
We will now present a useful characterization of Bessel sequences. 

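Theorem 2.3. Let $\{f_k\}_{k=1}^\infty$ be a sequence in $\mathcal{H}$ and $B > 0$ be given. Then $\{f_k\}_{k=1}^\infty$
is a Bessel sequence with Bessel bound $B$ if and only if

$$T : \{c_k\}_{k=1}^\infty \mapsto \sum_{k=1}^\infty c_k f_k$$

defines a bounded operator from $\ell^2(\mathbb{N})$ into $\mathcal{H}$ and $\|T\| \le \sqrt{B}$.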

Proof. First, assume that $\{f_k\}_{k=1}^\infty$ is a Bessel sequence with Bessel bound $B$. Let
$\{c_k\}_{k=1}^\infty \in \ell^2(\mathbb{N})$. First, we want to show that $T\{c_k\}_{k=1}^\infty$ is well defined, i.e., that
$\sum_{k=1}^\infty c_k f_k$ is convergent. Consider $n, m \in \mathbb{N}$, $n > m$. Then

$$\Big\| \sum_{k=1}^n c_k f_k - \sum_{k=1}^m c_k f_k \Big\| = \Big\| \sum_{k=m+1}^n c_k f_k \Big\|.$$

It follows that

$$\begin{aligned}
\Big\| \sum_{k=1}^n c_k f_k - \sum_{k=1}^m c_k f_k \Big\|
&= \sup_{\|g\|=1} \Big| \Big\langle \sum_{k=m+1}^n c_k f_k, g \Big\rangle \Big| \\
&\le \sup_{\|g\|=1} \sum_{k=m+1}^n |c_k \langle f_k, g \rangle| \\
&\le \Big( \sum_{k=m+1}^n |c_k|^2 \Big)^{1/2} \sup_{\|g\|=1} \Big( \sum_{k=m+1}^n |\langle f_k, g \rangle|^2 \Big)^{1/2} \\
&\le \sqrt{B} \, \Big( \sum_{k=m+1}^n |c_k|^2 \Big)^{1/2}.
\end{aligned}$$

Since $\{c_k\}_{k=1}^\infty \in \ell^2(\mathbb{N})$, we know that $\{\sum_{k=1}^n |c_k|^2\}_{n=1}^\infty$ is a Cauchy sequence in
$\mathbb{C}$. The above calculation now shows that $\{\sum_{k=1}^n c_k f_k\}_{n=1}^\infty$ is a Cauchy sequence
in $\mathcal{H}$ and therefore convergent. Thus, $T\{c_k\}_{k=1}^\infty$ is well defined. Clearly, $T$ is linear;
since $\|T\{c_k\}_{k=1}^\infty\| = \sup_{\|g\|=1} |\langle T\{c_k\}_{k=1}^\infty, g \rangle|$, a calculation as above shows that $T$
is bounded and that $\|T\| \le \sqrt{B}$.

For the opposite implication, suppose that $T$ defines a bounded operator with
$\|T\| \le \sqrt{B}$. Then Lemma 2.1 shows that $\{f_k\}_{k=1}^\infty$ is a Bessel sequence with Bessel
bound $B$. □
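Theorem 2.3 can be checked by hand in finite dimensions. The following Python sketch is an illustration under stated assumptions, not part of the general theory: for finitely many vectors in $\mathbb{R}^2$, the operator $TT^*$ is the symmetric $2 \times 2$ matrix $\sum_k f_k f_k^{\mathsf{T}}$, and its largest eigenvalue equals $\|T\|^2$, i.e., the optimal Bessel bound. The family `family` and all helper names are hypothetical choices made for this example.

```python
import math

def inner(u, v):
    """Real inner product on R^2."""
    return u[0] * v[0] + u[1] * v[1]

def optimal_bessel_bound(vectors):
    """Largest eigenvalue of the 2x2 matrix sum_k f_k f_k^T, i.e. ||T||^2."""
    a = sum(f[0] * f[0] for f in vectors)
    b = sum(f[0] * f[1] for f in vectors)
    d = sum(f[1] * f[1] for f in vectors)
    # Eigenvalues of [[a, b], [b, d]] are mean +/- radius.
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean + radius

def bessel_sum(f, vectors):
    """Left-hand side of (2.5): sum_k |<f, f_k>|^2."""
    return sum(inner(f, fk) ** 2 for fk in vectors)

# Hypothetical example family (assumption for illustration only).
family = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
B = optimal_bessel_bound(family)
```

For this family the bound is attained at the unit vector $(1,1)/\sqrt{2}$, so no smaller constant satisfies (2.5).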


The Bessel condition (2.5) remains the same regardless of how the elements
$\{f_k\}_{k=1}^\infty$ are numbered. This leads to a very important consequence of Theorem 2.3:

Corollary 2.4. If $\{f_k\}_{k=1}^\infty$ is a Bessel sequence in $\mathcal{H}$, then $\sum_{k=1}^\infty c_k f_k$ converges
unconditionally for all $\{c_k\}_{k=1}^\infty \in \ell^2(\mathbb{N})$.

Thus, a reordering of the elements in $\{f_k\}_{k=1}^\infty$ will not affect the series $\sum_{k=1}^\infty c_k f_k$
when $\{c_k\}_{k=1}^\infty$ is reordered the same way: The series will converge toward the same
element as before. For this reason we can choose an arbitrary indexing of the elements
in the Bessel sequence; in particular, it is not a restriction that we present all
results with the natural numbers as the index set. As we will see in the sequel, all
orthonormal bases, Riesz bases, and frames are Bessel sequences.


2.3 General Bases and Orthonormal Bases 

Before we introduce the frame concept in Section 2.5, we briefly remind the reader
about bases in Hilbert spaces. In particular, we will discuss orthonormal bases.
Orthonormal bases are widely used in mathematics as well as in physics, signal
processing, and many other areas where one needs to represent functions in terms
of “elementary building blocks.”

Definition 2.5. Consider a sequence $\{e_k\}_{k=1}^\infty$ of vectors in a Hilbert space $\mathcal{H}$.

1. The sequence $\{e_k\}_{k=1}^\infty$ is a (Schauder) basis for $\mathcal{H}$ if for each $f \in \mathcal{H}$ there exist
unique scalar coefficients $\{c_k(f)\}_{k=1}^\infty$ such that

$$f = \sum_{k=1}^\infty c_k(f) \, e_k. \tag{2.6}$$

2. A basis $\{e_k\}_{k=1}^\infty$ is an unconditional basis if the series (2.6) converges unconditionally
for each $f \in \mathcal{H}$.

3. A basis $\{e_k\}_{k=1}^\infty$ is an orthonormal basis if $\{e_k\}_{k=1}^\infty$ is an orthonormal system, i.e.,
if

$$\langle e_k, e_j \rangle = \delta_{k,j} = \begin{cases} 1 & \text{if } k = j, \\ 0 & \text{if } k \ne j. \end{cases}$$


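The next well-known theorem gives equivalent conditions for an orthonormal
system $\{e_k\}_{k=1}^\infty$ to be an orthonormal basis.

Theorem 2.6. For an orthonormal system $\{e_k\}_{k=1}^\infty$, the following are equivalent:

1. $\{e_k\}_{k=1}^\infty$ is an orthonormal basis.

2. $f = \sum_{k=1}^\infty \langle f, e_k \rangle e_k$, $\forall f \in \mathcal{H}$.

3. $\langle f, g \rangle = \sum_{k=1}^\infty \langle f, e_k \rangle \langle e_k, g \rangle$, $\forall f, g \in \mathcal{H}$.

4. $\sum_{k=1}^\infty |\langle f, e_k \rangle|^2 = \|f\|^2$, $\forall f \in \mathcal{H}$.

5. $\overline{\operatorname{span}}\{e_k\}_{k=1}^\infty = \mathcal{H}$.

6. If $\langle f, e_k \rangle = 0$, $\forall k \in \mathbb{N}$, then $f = 0$.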


The equality in item 4 is called Parseval's equation; in particular, it shows that
an orthonormal system $\{e_k\}_{k=1}^\infty$ is a Bessel sequence. Via Corollary 2.4, we obtain
the following important consequence of Theorem 2.6:

Corollary 2.7. If $\{e_k\}_{k=1}^\infty$ is an orthonormal basis, then each $f \in \mathcal{H}$ has an unconditionally
convergent expansion

$$f = \sum_{k=1}^\infty \langle f, e_k \rangle e_k. \tag{2.7}$$

The expansion property (2.7) is the main reason for considering orthonormal 
bases. In practice, orthonormal bases are certainly the most convenient bases to 
use: We will later see that, for other types of bases, the representation (2.7) has to be 
replaced by a more complicated expression. Unfortunately, the conditions for 
$\{e_k\}_{k=1}^\infty$ being an orthonormal basis are strong, and often it is impossible to
construct orthonormal bases satisfying extra conditions.

The following theorem characterizes all orthonormal bases for $\mathcal{H}$ in terms of an
operator acting on an arbitrary orthonormal basis.
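Theorem 2.8. Let $\{e_k\}_{k=1}^\infty$ be an orthonormal basis for $\mathcal{H}$. Then the orthonormal
bases for $\mathcal{H}$ are precisely the sets $\{U e_k\}_{k=1}^\infty$, where $U : \mathcal{H} \to \mathcal{H}$ is a unitary
operator.

Proof. Let $\{f_k\}_{k=1}^\infty$ be an orthonormal basis for $\mathcal{H}$. Define the operator

$$U\Big( \sum_{k=1}^\infty c_k e_k \Big) = \sum_{k=1}^\infty c_k f_k, \quad \{c_k\}_{k=1}^\infty \in \ell^2(\mathbb{N}).$$

Then $U$ maps $\mathcal{H}$ boundedly and bijectively onto $\mathcal{H}$, and $f_k = U e_k$. For $f, g \in \mathcal{H}$,
write $f = \sum \langle f, e_k \rangle e_k$ and $g = \sum \langle g, e_k \rangle e_k$; then, via the definition of $U$ and
Theorem 2.6,

$$\begin{aligned}
\langle U^* U f, g \rangle &= \langle U f, U g \rangle \\
&= \Big\langle \sum \langle f, e_k \rangle f_k, \sum \langle g, e_k \rangle f_k \Big\rangle \\
&= \sum \langle f, e_k \rangle \overline{\langle g, e_k \rangle} = \langle f, g \rangle.
\end{aligned}$$

This implies that $U^* U = I$. Since $U$ is surjective, it follows that $U$ is unitary. On the
other hand, if $U$ is a given unitary operator, then

$$\langle U e_k, U e_j \rangle = \langle U^* U e_k, e_j \rangle = \langle e_k, e_j \rangle = \delta_{k,j},$$

i.e., $\{U e_k\}_{k=1}^\infty$ is an orthonormal system. That it is a basis follows from Theorem
2.6 and the fact that $U$ is surjective. □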
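A small numerical illustration of Theorem 2.8 and of the expansion (2.7), offered as a sketch under simplifying assumptions: we work in $\mathbb{R}^2$, take $U$ to be a rotation (which is unitary), and check that $\{U e_k\}$ is orthonormal, that it reproduces a test vector via (2.7), and that Parseval's equation holds. The angle `0.7`, the test vector `f`, and the helper names are arbitrary choices for this example.

```python
import math

def inner(u, v):
    """Real inner product on R^2."""
    return u[0] * v[0] + u[1] * v[1]

def rotation(theta):
    """Return the rotation by theta, a unitary operator on R^2, as a function."""
    c, s = math.cos(theta), math.sin(theta)
    return lambda v: (c * v[0] - s * v[1], s * v[0] + c * v[1])

U = rotation(0.7)
standard = [(1.0, 0.0), (0.0, 1.0)]
rotated = [U(e) for e in standard]            # the system {U e_k}

f = (2.0, -1.0)
coeffs = [inner(f, e) for e in rotated]       # coefficients <f, U e_k>
# Expansion (2.7): f = sum_k <f, U e_k> U e_k
recon = tuple(sum(c * e[i] for c, e in zip(coeffs, rotated)) for i in range(2))
# Parseval's equation: sum_k |<f, U e_k>|^2 = ||f||^2
parseval = sum(c * c for c in coeffs)
```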


2.4 Riesz Bases 

In Theorem 2.8 we characterized all orthonormal bases in terms of unitary operators
acting on a single orthonormal basis. Formally, the definition of a Riesz basis
appears by weakening the condition on the operator:

Definition 2.9. A Riesz basis for $\mathcal{H}$ is a family of the form $\{U e_k\}_{k=1}^\infty$, where
$\{e_k\}_{k=1}^\infty$ is an orthonormal basis for $\mathcal{H}$ and $U : \mathcal{H} \to \mathcal{H}$ is a bounded bijective
operator.

A Riesz basis $\{f_k\}_{k=1}^\infty$ is actually a basis; this follows from the proof of Theorem
2.10, which we state now. Note that the expansion (2.8) of elements $f \in \mathcal{H}$ in
terms of a Riesz basis is more involved than the expression (2.7) we obtained via
orthonormal bases:

Theorem 2.10. If $\{f_k\}_{k=1}^\infty$ is a Riesz basis for $\mathcal{H}$, then $\{f_k\}_{k=1}^\infty$ is a Bessel sequence.
Furthermore, there exists a unique sequence $\{g_k\}_{k=1}^\infty$ in $\mathcal{H}$ such that

$$f = \sum_{k=1}^\infty \langle f, g_k \rangle f_k, \quad \forall f \in \mathcal{H}. \tag{2.8}$$

The sequence $\{g_k\}_{k=1}^\infty$ is also a Riesz basis, and the series (2.8) converges
unconditionally for all $f \in \mathcal{H}$.

Proof. According to the definition, we can write $\{f_k\}_{k=1}^\infty = \{U e_k\}_{k=1}^\infty$, where $U$ is
a bounded bijective operator and $\{e_k\}_{k=1}^\infty$ is an orthonormal basis. Now let $f \in \mathcal{H}$.
By expanding $U^{-1} f$ in the orthonormal basis $\{e_k\}_{k=1}^\infty$, we have

$$U^{-1} f = \sum_{k=1}^\infty \langle U^{-1} f, e_k \rangle e_k = \sum_{k=1}^\infty \langle f, (U^{-1})^* e_k \rangle e_k.$$

Therefore, with $g_k := (U^{-1})^* e_k$,

$$f = U U^{-1} f = \sum_{k=1}^\infty \langle f, (U^{-1})^* e_k \rangle U e_k = \sum_{k=1}^\infty \langle f, g_k \rangle f_k.$$

Since the operator $(U^{-1})^*$ is bounded and bijective, $\{g_k\}_{k=1}^\infty$ is a Riesz basis by
definition. Furthermore, for $f \in \mathcal{H}$,

$$\sum_{k=1}^\infty |\langle f, f_k \rangle|^2 = \sum_{k=1}^\infty |\langle f, U e_k \rangle|^2 = \|U^* f\|^2 \tag{2.9}$$

$$\le \|U^*\|^2 \, \|f\|^2 = \|U\|^2 \, \|f\|^2. \tag{2.10}$$

This proves that a Riesz basis is a Bessel sequence. Thus, the series (2.8) converges
unconditionally by Corollary 2.4. We complete the proof by showing that the
sequence $\{g_k\}_{k=1}^\infty$ constructed in the proof is the only one that satisfies (2.8). For
that purpose, we first note that if

$$f = \sum_{k=1}^\infty c_k(f) f_k = \sum_{k=1}^\infty d_k(f) f_k \tag{2.11}$$

for some coefficients $c_k(f)$ and $d_k(f)$, then necessarily $c_k(f) = d_k(f)$ for all $k \in \mathbb{N}$;
this follows by applying the operator $U^{-1}$ on both sides of the equality and using
that $\{e_k\}_{k=1}^\infty$ is known to be a basis. This argument shows that a Riesz basis actually
is a basis. Now we only have to show that if $\{g_k\}_{k=1}^\infty$ and $\{h_k\}_{k=1}^\infty$ are sequences in
$\mathcal{H}$ such that

$$f = \sum_{k=1}^\infty \langle f, g_k \rangle f_k = \sum_{k=1}^\infty \langle f, h_k \rangle f_k, \quad \forall f \in \mathcal{H}, \tag{2.12}$$

then $g_k = h_k$ for all $k \in \mathbb{N}$. However, due to the argument above, (2.12) implies that
for all $k \in \mathbb{N}$,

$$\langle f, g_k \rangle = \langle f, h_k \rangle, \quad \forall f \in \mathcal{H};$$

the desired result now follows. □
The unique sequence $\{g_k\}_{k=1}^\infty$ satisfying (2.8) is called the dual Riesz basis
of $\{f_k\}_{k=1}^\infty$. Let us find the dual of $\{g_k\}_{k=1}^\infty$. In the notation used in the proof of
Theorem 2.10, we have that the dual of $\{f_k\}_{k=1}^\infty = \{U e_k\}_{k=1}^\infty$ is given by

$$\{g_k\}_{k=1}^\infty = \{(U^{-1})^* e_k\}_{k=1}^\infty;$$

thus, the dual of $\{g_k\}_{k=1}^\infty$ is

$$\big\{ \big( ((U^{-1})^*)^{-1} \big)^* e_k \big\}_{k=1}^\infty = \{U e_k\}_{k=1}^\infty = \{f_k\}_{k=1}^\infty.$$

That is, $\{f_k\}_{k=1}^\infty$ and $\{g_k\}_{k=1}^\infty$ are duals of each other. For this reason, we frequently
call $\{f_k\}_{k=1}^\infty$ and $\{g_k\}_{k=1}^\infty$ a pair of dual Riesz bases.
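The formula $g_k = (U^{-1})^* e_k$ from the proof of Theorem 2.10 is easy to try out in $\mathbb{R}^2$, where the adjoint is the matrix transpose. The following sketch assumes a particular invertible matrix `U` (a hypothetical choice, as are the helper names) and checks the expansion (2.8) together with the biorthogonality $\langle f_j, g_k \rangle = \delta_{j,k}$ that it implies.

```python
def inner(u, v):
    """Real inner product on R^2."""
    return u[0] * v[0] + u[1] * v[1]

def mat_vec(m, v):
    """Apply a 2x2 matrix (list of rows) to a vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])

def inverse(m):
    """Inverse of an invertible 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def transpose(m):
    """Transpose, which is the adjoint for real matrices."""
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]

# Hypothetical bounded bijective operator on R^2 (assumption for illustration).
U = [[1.0, 1.0], [0.0, 1.0]]
e = [(1.0, 0.0), (0.0, 1.0)]

f_basis = [mat_vec(U, ek) for ek in e]                      # f_k = U e_k
g_basis = [mat_vec(transpose(inverse(U)), ek) for ek in e]  # g_k = (U^{-1})^* e_k

def expand(f):
    """Right-hand side of (2.8): sum_k <f, g_k> f_k."""
    coeffs = [inner(f, gk) for gk in g_basis]
    return tuple(sum(c * fk[i] for c, fk in zip(coeffs, f_basis)) for i in range(2))
```

Here $f_1 = (1,0)$, $f_2 = (1,1)$ and the dual basis is $g_1 = (1,-1)$, $g_2 = (0,1)$; neither system is orthogonal, yet (2.8) still recovers every vector.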

Already in the proof of Theorem 2.10, we saw that a Riesz basis is a Bessel
sequence. For later use, we now state that it also satisfies some kind of “opposite
inequality”; the proof is left to the reader (Exercise 5).

Proposition 2.11. If $\{f_k\}_{k=1}^\infty = \{U e_k\}_{k=1}^\infty$ is a Riesz basis for $\mathcal{H}$, there exist
constants $A, B > 0$ such that

$$A \, \|f\|^2 \le \sum_{k=1}^\infty |\langle f, f_k \rangle|^2 \le B \, \|f\|^2, \quad \forall f \in \mathcal{H}. \tag{2.13}$$

The largest possible value for the constant $A$ is $1 / \|U^{-1}\|^2$, and the smallest possible
value for $B$ is $\|U\|^2$.


For completeness we finally state an equivalent characterization of Riesz bases.
Several authors use condition 2 as the definition of a Riesz basis; the proof of the
equivalence with the definition used here can be found, e.g., in [246] or [49].


Theorem 2.12. For a sequence $\{f_k\}_{k=1}^\infty$ in $\mathcal{H}$, the following conditions are
equivalent:

1. $\{f_k\}_{k=1}^\infty$ is a Riesz basis for $\mathcal{H}$.

2. $\{f_k\}_{k=1}^\infty$ is complete in $\mathcal{H}$, and there exist constants $A, B > 0$ such that for every
finite scalar sequence $\{c_k\}$, one has

$$A \sum |c_k|^2 \le \Big\| \sum c_k f_k \Big\|^2 \le B \sum |c_k|^2. \tag{2.14}$$

A sequence $\{f_k\}_{k=1}^\infty$ satisfying (2.14) for all finite sequences $\{c_k\}$ is called a
Riesz sequence.

If (2.14) holds for all finite scalar sequences $\{c_k\}$, then it automatically holds for
all $\{c_k\}_{k=1}^\infty \in \ell^2(\mathbb{N})$; see Exercise 6. If $\{f_k\}_{k=1}^\infty$ is a Riesz basis, numbers $A, B > 0$
that satisfy (2.14) are called lower Riesz bounds, respectively, upper Riesz bounds.
They are clearly not unique, and we define the optimal Riesz bounds as the largest
possible value for $A$ and the smallest possible value for $B$.






2 B-Spline Generated Frames 

2.5 Frames and Their Properties 




We are now ready to introduce frames. Frames were invented in 1952 by Duffin 
and Schaeffer [77], but it took several years before the potential was realized by the 
scientific community. By now, frame theory is well established; we will only give a 
glimpse of the general theory, and focus on the parts of the theory that are important 
for our later constructions based on B-splines. 

Definition 2.13. A sequence $\{f_k\}_{k=1}^\infty$ of elements in $\mathcal{H}$ is a frame for $\mathcal{H}$ if there
exist constants $A, B > 0$ such that

$$A \, \|f\|^2 \le \sum_{k=1}^\infty |\langle f, f_k \rangle|^2 \le B \, \|f\|^2, \quad \forall f \in \mathcal{H}. \tag{2.15}$$

The numbers $A$ and $B$ are called frame bounds. They are not unique. The optimal
upper frame bound is the infimum over all upper frame bounds, and the optimal
lower frame bound is the supremum over all lower frame bounds. Note that the
optimal bounds actually are frame bounds.

The following lemma shows that it is enough to check the frame condition on a 
dense set. The proof is left to the reader as Exercise 7. 

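Lemma 2.14. Suppose that $\{f_k\}_{k=1}^\infty$ is a sequence of elements in $\mathcal{H}$ and that there
exist constants $A, B > 0$ such that

$$A \, \|f\|^2 \le \sum_{k=1}^\infty |\langle f, f_k \rangle|^2 \le B \, \|f\|^2 \tag{2.16}$$

for all $f$ in a dense subset $V$ of $\mathcal{H}$. Then $\{f_k\}_{k=1}^\infty$ is a frame for $\mathcal{H}$ with bounds $A, B$.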

A special role is played by frames for which the optimal frame bounds coincide: 

Definition 2.15. A sequence $\{f_k\}_{k=1}^\infty$ in $\mathcal{H}$ is a tight frame if there exists a number
$A > 0$ such that

$$\sum_{k=1}^\infty |\langle f, f_k \rangle|^2 = A \, \|f\|^2, \quad \forall f \in \mathcal{H}.$$

The (exact) number $A$ is called the frame bound.

Since a frame $\{f_k\}_{k=1}^\infty$ is a Bessel sequence, the operator

$$T : \ell^2(\mathbb{N}) \to \mathcal{H}, \quad T\{c_k\}_{k=1}^\infty = \sum_{k=1}^\infty c_k f_k \tag{2.17}$$

is bounded by Theorem 2.3; $T$ is called the preframe operator or the synthesis
operator. By Lemma 2.1, the adjoint operator is given by

$$T^* : \mathcal{H} \to \ell^2(\mathbb{N}), \quad T^* f = \{\langle f, f_k \rangle\}_{k=1}^\infty. \tag{2.18}$$

$T^*$ is called the analysis operator. Composing $T$ and $T^*$, we obtain the frame
operator

$$S f = T T^* f = \sum_{k=1}^\infty \langle f, f_k \rangle f_k. \tag{2.19}$$

Note that because $\{f_k\}_{k=1}^\infty$ is a Bessel sequence, the series defining $S$ converges
unconditionally for all $f \in \mathcal{H}$ by Corollary 2.4. We state some of the important
properties of $S$; proofs can be found, e.g., in [135] or [49].

Lemma 2.16. Let $\{f_k\}_{k=1}^\infty$ be a frame with frame operator $S$ and frame bounds $A, B$.
Then the following hold:

1. $S$ is bounded, invertible, self-adjoint, and positive.

2. $\{S^{-1} f_k\}_{k=1}^\infty$ is a frame with frame operator $S^{-1}$ and frame bounds $B^{-1}, A^{-1}$.

The frame $\{S^{-1} f_k\}_{k=1}^\infty$ is called the canonical dual frame of $\{f_k\}_{k=1}^\infty$. The reason
for the name will soon become clear; in fact, Theorem 2.17 will show that
$\{S^{-1} f_k\}_{k=1}^\infty$ plays the same role in frame theory as the dual basis in the theory of
bases.

The frame decomposition, stated in (2.20) below, is one of the most important
frame results. It shows that if $\{f_k\}_{k=1}^\infty$ is a frame for $\mathcal{H}$, then every element in $\mathcal{H}$
has a representation as an infinite linear combination of the frame elements. Thus,
it is natural to view a frame as some kind of “generalized basis.”


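Theorem 2.17. Let $\{f_k\}_{k=1}^\infty$ be a frame with frame operator $S$. Then

$$f = \sum_{k=1}^\infty \langle f, S^{-1} f_k \rangle f_k, \quad \forall f \in \mathcal{H}, \tag{2.20}$$

and

$$f = \sum_{k=1}^\infty \langle f, f_k \rangle S^{-1} f_k, \quad \forall f \in \mathcal{H}. \tag{2.21}$$

Both series converge unconditionally for all $f \in \mathcal{H}$.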


Proof. Let $f \in \mathcal{H}$. Using the properties of the frame operator in Lemma 2.16,

$$f = S S^{-1} f = \sum_{k=1}^\infty \langle S^{-1} f, f_k \rangle f_k = \sum_{k=1}^\infty \langle f, S^{-1} f_k \rangle f_k.$$

Because $\{f_k\}_{k=1}^\infty$ is a Bessel sequence and $\{\langle f, S^{-1} f_k \rangle\}_{k=1}^\infty \in \ell^2(\mathbb{N})$, the fact that the
series converges unconditionally follows from Corollary 2.4. The expansion (2.21)
is proved similarly, using that $f = S^{-1} S f$. □
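For a finite frame in $\mathbb{R}^2$, the frame operator is a $2 \times 2$ matrix, so both expansions in Theorem 2.17 can be computed directly. The sketch below assumes the hypothetical frame $\{(1,0), (0,1), (1,1)\}$ and hypothetical helper names; it forms $S$, the canonical dual $\{S^{-1} f_k\}$, and evaluates (2.20) and (2.21).

```python
def inner(u, v):
    """Real inner product on R^2."""
    return u[0] * v[0] + u[1] * v[1]

def mat_vec(m, v):
    """Apply a 2x2 matrix (list of rows) to a vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])

def inverse(m):
    """Inverse of an invertible 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def frame_operator(frame):
    """Matrix of S f = sum_k <f, f_k> f_k in the standard basis."""
    return [[sum(fk[i] * fk[j] for fk in frame) for j in range(2)] for i in range(2)]

# Hypothetical overcomplete frame for R^2 (assumption for illustration).
frame = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
S_inv = inverse(frame_operator(frame))
canonical_dual = [mat_vec(S_inv, fk) for fk in frame]   # {S^{-1} f_k}

def expansion_2_20(f):
    """f = sum_k <f, S^{-1} f_k> f_k."""
    return tuple(sum(inner(f, gk) * fk[i] for gk, fk in zip(canonical_dual, frame))
                 for i in range(2))

def expansion_2_21(f):
    """f = sum_k <f, f_k> S^{-1} f_k."""
    return tuple(sum(inner(f, fk) * gk[i] for gk, fk in zip(canonical_dual, frame))
                 for i in range(2))
```

In this example $S$ has rows $(2,1)$ and $(1,2)$, and the canonical dual is $\{(2/3,-1/3), (-1/3,2/3), (1/3,1/3)\}$.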

Theorem 2.17 shows that all information about a given vector $f \in \mathcal{H}$ is
contained in the sequence $\{\langle f, S^{-1} f_k \rangle\}_{k=1}^\infty$. The numbers $\langle f, S^{-1} f_k \rangle$ are called frame
coefficients.

Theorem 2.17 also immediately reveals one of the main difficulties in frame
theory. In fact, in order for the expansions (2.20) and (2.21) to be applicable in
practice, we need to be able to find the operator $S^{-1}$, or at least to calculate its
action on all $f_k$, $k \in \mathbb{N}$. In general, this is a major problem. One way of
circumventing the problem is to consider only tight frames:

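Corollary 2.18. If $\{f_k\}_{k=1}^\infty$ is a tight frame with frame bound $A$, then the canonical
dual frame is $\{A^{-1} f_k\}_{k=1}^\infty$, and

$$f = \frac{1}{A} \sum_{k=1}^\infty \langle f, f_k \rangle f_k, \quad \forall f \in \mathcal{H}. \tag{2.22}$$

Proof. If $\{f_k\}_{k=1}^\infty$ is a tight frame with frame bound $A$ and frame operator $S$, the
definition shows that

$$\langle S f, f \rangle = \sum_{k=1}^\infty |\langle f, f_k \rangle|^2 = A \, \|f\|^2 = \langle A f, f \rangle, \quad \forall f \in \mathcal{H}.$$

Since $S$ is self-adjoint, this implies that $S = AI$; thus, $S^{-1}$ acts by multiplication by
$A^{-1}$, and the result follows from (2.20). □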

By a suitable scaling of the vectors $\{f_k\}_{k=1}^\infty$ in a tight frame, we can always obtain
that $A = 1$; in that case, (2.22) has exactly the same form as the representation via an
orthonormal basis; see (2.7). Thus, such frames can be used without any additional
computational effort compared with the use of orthonormal bases.
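As a concrete instance of Corollary 2.18, consider three unit vectors in $\mathbb{R}^2$ at mutual angles of $120°$ (often called the Mercedes-Benz frame; this example is not taken from the text above). A direct computation gives $\sum_k f_k f_k^{\mathsf{T}} = \tfrac{3}{2} I$, so the family is a tight frame with $A = 3/2$. The sketch below, with hypothetical helper names, verifies the tightness identity and the reconstruction formula (2.22) without inverting any operator.

```python
import math

def inner(u, v):
    """Real inner product on R^2."""
    return u[0] * v[0] + u[1] * v[1]

# Three unit vectors at 120 degrees (assumed example of a tight frame).
mb_frame = [
    (0.0, 1.0),
    (-math.sqrt(3) / 2, -0.5),
    (math.sqrt(3) / 2, -0.5),
]
A = 1.5  # frame bound of this particular tight frame

def frame_sum(f, frame):
    """sum_k |<f, f_k>|^2, which should equal A * ||f||^2 for a tight frame."""
    return sum(inner(f, fk) ** 2 for fk in frame)

def tight_reconstruct(f, frame, bound):
    """Formula (2.22): f = (1/A) sum_k <f, f_k> f_k."""
    return tuple(sum(inner(f, fk) * fk[i] for fk in frame) / bound for i in range(2))
```

Rescaling each vector by $\sqrt{2/3}$ would give $A = 1$, the normalization mentioned above.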

Tight frames have other advantages. For the design of frames with prescribed 
properties, it is essential to control the behavior of the canonical dual frame, but the 
complicated structure of the frame operator and its inverse makes this difficult. If, 
for example, we consider a frame $\{f_k\}_{k=1}^\infty$ for $L^2(\mathbb{R})$ consisting of functions with
exponential decay, nothing guarantees that the functions in the canonical dual frame
$\{S^{-1} f_k\}_{k=1}^\infty$ have exponential decay. However, for tight frames, questions of this
type trivially have satisfying answers. Also, for a tight frame, the canonical dual 
frame automatically has the same structure as the frame itself: If the frame has a 
wavelet structure or a Gabor structure (see Sections 2.9-2.13), the same is the case 
for the canonical dual frame. In contrast, the canonical dual frame of a nontight 
wavelet frame might not have the wavelet structure. 

Later we will discuss another way to avoid the problem of inverting the frame
operator $S$. In fact, for frames $\{f_k\}_{k=1}^\infty$ that are not bases, we prove in Theorem 2.21
that one can find other frames $\{g_k\}_{k=1}^\infty$ than $\{S^{-1} f_k\}_{k=1}^\infty$, for which

$$f = \sum_{k=1}^\infty \langle f, g_k \rangle f_k, \quad \forall f \in \mathcal{H}. \tag{2.23}$$

Such a frame $\{g_k\}_{k=1}^\infty$ is called a dual frame of $\{f_k\}_{k=1}^\infty$. Now, there is a chance that
even if the canonical dual frame is difficult to find, there exist other duals that are
easy to find; or, that it is possible to find duals having more pleasant properties than
the canonical dual. In Section 2.12 we discuss such cases.

A note on terminology is in order. In Exercise 9 we ask the reader to prove that if
$\{g_k\}_{k=1}^\infty$ is a dual frame of $\{f_k\}_{k=1}^\infty$, then $\{f_k\}_{k=1}^\infty$ is also a dual of $\{g_k\}_{k=1}^\infty$. For this
reason, we will usually call $\{f_k\}_{k=1}^\infty$ and $\{g_k\}_{k=1}^\infty$ a pair of dual frames, or a dual
frame pair, when (2.23) holds.
2.6 Frames and Riesz Bases

As we have seen, a frame $\{f_k\}_{k=1}^\infty$ in a Hilbert space $\mathcal{H}$ has one of the main
properties of a basis: Given $f \in \mathcal{H}$, there exist coefficients $\{c_k\}_{k=1}^\infty \in \ell^2(\mathbb{N})$ such
that $f = \sum_{k=1}^\infty c_k f_k$. This makes it natural to study the relationship between frames
and bases. In this section we notice that all Riesz bases are frames and characterize
the frames that are actually Riesz bases.

Theorem 2.19. A Riesz basis $\{f_k\}_{k=1}^\infty$ for $\mathcal{H}$ is a frame for $\mathcal{H}$. The dual Riesz basis
equals the canonical dual frame $\{S^{-1} f_k\}_{k=1}^\infty$.

Proof. By Proposition 2.11, a Riesz basis $\{f_k\}_{k=1}^\infty$ for $\mathcal{H}$ is also a frame for $\mathcal{H}$.
The rest follows from the frame decomposition combined with the uniqueness part
of Theorem 2.10. □

A frame that is not a Riesz basis is said to be overcomplete; in the literature,
the term “redundant frame” is also used. Theorem 2.20 will explain why the word
“overcomplete” is used: In fact, if $\{f_k\}_{k=1}^\infty$ is a frame that is not a Riesz basis, there
exist coefficients $\{c_k\}_{k=1}^\infty \in \ell^2(\mathbb{N}) \setminus \{0\}$ for which

$$\sum_{k=1}^\infty c_k f_k = 0. \tag{2.24}$$

That is, for such frames there is some dependency between the frame elements.
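Theorem 2.20. Let $\{f_k\}_{k=1}^\infty$ be a frame for $\mathcal{H}$. Then the following are equivalent:

1. $\{f_k\}_{k=1}^\infty$ is a Riesz basis for $\mathcal{H}$.

2. If $\sum_{k=1}^\infty c_k f_k = 0$ for some $\{c_k\}_{k=1}^\infty \in \ell^2(\mathbb{N})$, then $c_k = 0$, $\forall k \in \mathbb{N}$.

Proof. 1 ⇒ 2: Assume that $\{f_k\}_{k=1}^\infty$ is a Riesz basis and that $\sum_{k=1}^\infty c_k f_k = 0$ for a
sequence $\{c_k\}_{k=1}^\infty \in \ell^2(\mathbb{N})$. Writing $\{f_k\}_{k=1}^\infty = \{U e_k\}_{k=1}^\infty$ for a certain orthonormal
basis $\{e_k\}_{k=1}^\infty$ for $\mathcal{H}$ and an appropriate bounded bijective operator $U$ on $\mathcal{H}$, it follows that

$$U\Big( \sum_{k=1}^\infty c_k e_k \Big) = \sum_{k=1}^\infty c_k f_k = 0.$$

Because $U$ is injective, this implies that $\sum_{k=1}^\infty c_k e_k = 0$, and thus $c_k = 0$ for all $k$.

2 ⇒ 1: Let $\{\delta_k\}_{k=1}^\infty$ be the canonical orthonormal basis for $\ell^2(\mathbb{N})$. Assumption 2
assures that the preframe operator $T$ associated with $\{f_k\}_{k=1}^\infty$ is injective, and $T$ is
also surjective because $\{f_k\}_{k=1}^\infty$ is a frame. Since $T \delta_k = f_k$, $\forall k$, the result follows
from the definition of a Riesz basis. □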

Much more can be said about the relationship between frames and Riesz bases;
the following result and the proof are borrowed from [135].

Theorem 2.21. Assume that $\{f_k\}_{k=1}^\infty$ is an overcomplete frame. Then there exist
frames $\{g_k\}_{k=1}^\infty \ne \{S^{-1} f_k\}_{k=1}^\infty$ for which

$$f = \sum_{k=1}^\infty \langle f, g_k \rangle f_k, \quad \forall f \in \mathcal{H}. \tag{2.25}$$

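Proof. We split the proof in two cases and assume first that $f_\ell = 0$ for some
$\ell \in \mathbb{N}$; in this case $S^{-1} f_\ell = 0$. Letting $g_k := S^{-1} f_k$ for $k \ne \ell$ and choosing $g_\ell$ to
be any arbitrary nonzero vector, the frame decomposition shows that (2.25) holds;
and, clearly, $\{g_k\}_{k=1}^\infty \ne \{S^{-1} f_k\}_{k=1}^\infty$.

Now we consider the case where $f_k \ne 0$ for all $k \in \mathbb{N}$. By Theorem 2.20, there
exists a sequence $\{c_k\}_{k=1}^\infty \in \ell^2(\mathbb{N}) \setminus \{0\}$ such that

$$\sum_{k=1}^\infty c_k f_k = 0.$$

For a certain $\ell \in \mathbb{N}$, we have $c_\ell \ne 0$, and we can write

$$f_\ell = -\frac{1}{c_\ell} \sum_{k \ne \ell} c_k f_k.$$

We now show that $\{f_k\}_{k \ne \ell}$ is a frame for $\mathcal{H}$; we only have to prove that $\{f_k\}_{k \ne \ell}$
satisfies the lower frame condition. In order to do so, observe that for any $f \in \mathcal{H}$,
the Cauchy-Schwarz inequality shows that

$$\begin{aligned}
|\langle f, f_\ell \rangle|^2 &= \Big| -\frac{1}{\overline{c_\ell}} \sum_{k \ne \ell} \overline{c_k} \langle f, f_k \rangle \Big|^2 \\
&\le \frac{1}{|c_\ell|^2} \sum_{k \ne \ell} |c_k|^2 \sum_{k \ne \ell} |\langle f, f_k \rangle|^2 \\
&= C \sum_{k \ne \ell} |\langle f, f_k \rangle|^2,
\end{aligned}$$

where $C := (1/|c_\ell|^2) \sum_{k \ne \ell} |c_k|^2$. Letting $A$ denote a lower frame bound for the frame
$\{f_k\}_{k=1}^\infty$, this implies that

$$\begin{aligned}
A \, \|f\|^2 &\le \sum_{k=1}^\infty |\langle f, f_k \rangle|^2 \\
&= \sum_{k \ne \ell} |\langle f, f_k \rangle|^2 + |\langle f, f_\ell \rangle|^2 \\
&\le (1 + C) \sum_{k \ne \ell} |\langle f, f_k \rangle|^2.
\end{aligned}$$

This shows that $\{f_k\}_{k \ne \ell}$ indeed satisfies the lower frame condition.

Denoting the canonical dual frame of $\{f_k\}_{k \ne \ell}$ by $\{g_k\}_{k \ne \ell}$ and defining $g_\ell = 0$, we
have found a frame $\{g_k\}_{k=1}^\infty$ for which (2.25) holds; it is different from the canonical
dual of $\{f_k\}_{k=1}^\infty$ because $S^{-1} f_\ell \ne 0$. □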
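The construction in the proof can be followed step by step in $\mathbb{R}^2$. The sketch below assumes the hypothetical overcomplete frame $\{(1,0), (0,1), (1,1)\}$, for which $c = (1, 1, -1)$ is a dependency as in (2.24); it removes the third vector, takes the canonical dual of the remaining family, and sets the removed dual element to zero. All helper names are illustrative.

```python
def inner(u, v):
    """Real inner product on R^2."""
    return u[0] * v[0] + u[1] * v[1]

def mat_vec(m, v):
    """Apply a 2x2 matrix (list of rows) to a vector."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])

def inverse(m):
    """Inverse of an invertible 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def frame_operator(frame):
    """Matrix of S f = sum_k <f, f_k> f_k in the standard basis."""
    return [[sum(fk[i] * fk[j] for fk in frame) for j in range(2)] for i in range(2)]

frame = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]   # assumed example frame
c = (1.0, 1.0, -1.0)                            # dependency: sum_k c_k f_k = 0
ell = 2                                         # index with c[ell] != 0 (0-based)

# Canonical dual of the reduced family {f_k}, k != ell, padded with g_ell = 0.
reduced = [fk for k, fk in enumerate(frame) if k != ell]
S_reduced_inv = inverse(frame_operator(reduced))
alt_dual = [(0.0, 0.0) if k == ell else mat_vec(S_reduced_inv, fk)
            for k, fk in enumerate(frame)]

# Canonical dual of the full frame, for comparison.
S_inv = inverse(frame_operator(frame))
canonical_dual = [mat_vec(S_inv, fk) for fk in frame]

def synthesize(f, dual):
    """Right-hand side of (2.25): sum_k <f, g_k> f_k."""
    return tuple(sum(inner(f, gk) * fk[i] for gk, fk in zip(dual, frame))
                 for i in range(2))
```

Both duals reproduce every vector, but they are different sequences: here `alt_dual` is $\{(1,0), (0,1), (0,0)\}$.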
2.7 B-Splines

So far, we have considered frames in general Hilbert spaces. Our purpose is now
to consider concrete frames in $L^2(\mathbb{R})$. We will focus on frames generated by
B-splines, so in this section we recall the basic properties for these functions. For
more information on B-splines (and more general splines), we refer to the books by
de Boor [28] and Chui [51].

The B-splines are defined inductively: The first is simply

$$N_1(x) = \chi_{[0,1]}(x), \tag{2.26}$$

and, assuming that we have defined $N_n$ for some $n \in \mathbb{N}$, the next is defined by a
convolution:

$$N_{n+1}(x) = (N_n * N_1)(x) = \int_0^1 N_n(x - t) \, dt. \tag{2.27}$$

The functions $N_n$ defined by (2.26) and (2.27) are called B-splines, and $n$ is the
order. See Fig. 2.1 for graphs of the first few B-splines. We collect some of their
fundamental properties; all of the results can be proved by induction (Exercise 13).


Fig. 2.1: The B-splines $N_2$ and $N_3$, respectively.


Theorem 2.22. Given $n \in \mathbb{N}$, the B-spline $N_n$ has the following properties:

1. $\operatorname{supp} N_n = [0, n]$ and $N_n > 0$ on $(0, n)$.

2. $\int_{-\infty}^{\infty} N_n(x) \, dx = 1$.

3. For $n \ge 2$,

$$\sum_{k \in \mathbb{Z}} N_n(x - k) = 1 \quad \text{for all } x \in \mathbb{R}; \tag{2.28}$$

for $n = 1$, formula (2.28) holds for a.e. $x \in \mathbb{R}$.

4. For $n \in \mathbb{N}$,

$$\widehat{N_n}(\gamma) = \Big( \frac{1 - e^{-2\pi i \gamma}}{2\pi i \gamma} \Big)^n. \tag{2.29}$$
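The first three properties in Theorem 2.22 are easy to check numerically. The sketch below does not evaluate the convolution (2.27) directly; instead it uses the standard two-term recurrence for cardinal B-splines, $N_n(x) = \frac{x}{n-1} N_{n-1}(x) + \frac{n-x}{n-1} N_{n-1}(x-1)$ for $n \ge 2$, which is a known equivalent description but is not stated in this chapter. As a further assumption, $N_1$ is taken as the indicator of the half-open interval $[0,1)$, which differs from $\chi_{[0,1]}$ only at one point. Function names are illustrative.

```python
def bspline(n, x):
    """Cardinal B-spline N_n(x) via the standard two-term recurrence."""
    if n == 1:
        # Half-open interval: agrees with chi_[0,1] except at x = 1.
        return 1.0 if 0.0 <= x < 1.0 else 0.0
    return (x * bspline(n - 1, x) + (n - x) * bspline(n - 1, x - 1.0)) / (n - 1)

def partition_sum(n, x):
    """sum_k N_n(x - k); only shifts with 0 <= x - k <= n can contribute."""
    lo = int(x) - n - 1
    hi = int(x) + 1
    return sum(bspline(n, x - k) for k in range(lo, hi + 1))

def integral(n, steps=3000):
    """Midpoint-rule approximation of the integral of N_n over [0, n]."""
    h = n / steps
    return sum(bspline(n, (i + 0.5) * h) for i in range(steps)) * h
```

For example, `bspline(2, x)` is the hat function on $[0,2]$ and `bspline(3, 1.5)` is the peak value $3/4$ of the quadratic B-spline.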

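We will now consider a centered version of the discussed B-splines. For $n \in \mathbb{N}$, let

$$B_n(x) := T_{-n/2} N_n(x) = N_n\Big( x + \frac{n}{2} \Big). \tag{2.30}$$

We will also call the functions $B_n$ B-splines. Alternatively, one can define
these functions by

$$B_1 := \chi_{[-1/2, 1/2]}, \quad B_{n+1} := B_n * B_1, \quad n \in \mathbb{N}.$$

Thus, for any $n \in \mathbb{N}$, we have that

$$B_{n+1}(x) = \int_{-1/2}^{1/2} B_n(x - t) \, dt. \tag{2.31}$$

It is clear that $B_n$ has support on the interval $[-n/2, n/2]$. We state the following
direct consequences of Theorem 2.22:

Corollary 2.23. For $n \in \mathbb{N}$, the B-spline $B_n$ has the following properties:

1. For $n \ge 2$,

$$\sum_{k \in \mathbb{Z}} B_n(x - k) = 1 \quad \text{for all } x \in \mathbb{R}.$$

For $n = 1$, the formula holds for a.e. $x \in \mathbb{R}$.

2. $\widehat{B_n}(\gamma) = \Big( \dfrac{e^{\pi i \gamma} - e^{-\pi i \gamma}}{2\pi i \gamma} \Big)^n = \Big( \dfrac{\sin(\pi \gamma)}{\pi \gamma} \Big)^n.$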

2.8 Frames of Translates 

We will now start the approach to the explicit construction of frames in $L^2(\mathbb{R})$. Our
focus will be on frames having Gabor structure or wavelet structure, to be discussed
in Sections 2.9-2.13; however, both of these types of systems involve translation of
a fixed function, so we first give a short presentation of such systems.

For $b \in \mathbb{R}$, define the translation operator $T_b : L^2(\mathbb{R}) \to L^2(\mathbb{R})$ by

$$(T_b f)(x) = f(x - b), \quad x \in \mathbb{R}.$$
We will consider systems of functions in $L^2(\mathbb{R})$ of the form $\{T_k \phi\}_{k \in \mathbb{Z}}$, where $\phi$ is a
fixed function. Our goal is to state a characterization of Riesz sequences and frames
of this form.

Associated with a given function $\phi \in L^2(\mathbb{R})$, we will consider the function

$$\Phi(\gamma) = \sum_{k \in \mathbb{Z}} |\hat{\phi}(\gamma + k)|^2, \quad \gamma \in \mathbb{R}. \tag{2.32}$$

We will state the announced characterization of the frame properties for $\{T_k \phi\}_{k \in \mathbb{Z}}$
in terms of the function $\Phi$ associated with $\phi$. It was first proved by Benedetto and
Li in [18]. We state the result without proof:

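Theorem 2.24. Let $\phi \in L^2(\mathbb{R})$. For any $A, B > 0$, the following characterizations
hold:

1. $\{T_k \phi\}_{k \in \mathbb{Z}}$ is an orthonormal sequence if and only if

$$\Phi(\gamma) = 1, \quad \text{a.e. } \gamma \in [0,1].$$

2. $\{T_k \phi\}_{k \in \mathbb{Z}}$ is a Riesz sequence with bounds $A, B$ if and only if

$$A \le \Phi(\gamma) \le B, \quad \text{a.e. } \gamma \in [0,1].$$

3. $\{T_k \phi\}_{k \in \mathbb{Z}}$ is a frame sequence with bounds $A, B$ if and only if

$$A \le \Phi(\gamma) \le B, \quad \text{a.e. } \gamma \in [0,1] \setminus N,$$

where $N = \{\gamma \in [0,1] : \Phi(\gamma) = 0\}$.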

As a very important consequence of Theorem 2.24, we now prove that the 
integer-translates of any B-spline form a Riesz sequence. We formulate the result 
for the symmetric B-splines $B_n$ defined in (2.31), but the same result holds for the
B-splines $N_n$ in (2.27).

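Theorem 2.25. For each $n \in \mathbb{N}$, the sequence $\{T_k B_n\}_{k \in \mathbb{Z}}$ is a Riesz sequence.

Proof. For $n = 1$, $\{T_k B_1\}_{k \in \mathbb{Z}}$ is an orthonormal system, and therefore a Riesz
sequence. In order to prove the result for $n > 1$, we apply Theorem 2.24 to $B_1$;
this shows that

$$\sum_{k \in \mathbb{Z}} |\widehat{B_1}(\gamma + k)|^2 = 1, \quad \text{a.e. } \gamma \in \mathbb{R}.$$

Since $|\widehat{B_1}(\gamma)| \le 1$ for all $\gamma \in \mathbb{R}$ and $\widehat{B_n}(\gamma) = (\widehat{B_1}(\gamma))^n$ by Corollary 2.23, it immediately
follows that

$$\sum_{k \in \mathbb{Z}} |\widehat{B_n}(\gamma + k)|^2 \le \sum_{k \in \mathbb{Z}} |\widehat{B_1}(\gamma + k)|^2 = 1, \quad \text{a.e. } \gamma \in \mathbb{R}.$$

Thus, $\{T_k B_n\}_{k \in \mathbb{Z}}$ is a Bessel sequence. In order to prove that $\{T_k B_n\}_{k \in \mathbb{Z}}$ satisfies
the lower Riesz basis condition, we again use Corollary 2.23: It shows
that, for a.e. $\gamma \in \mathbb{R}$,

$$\sum_{k \in \mathbb{Z}} |\widehat{B_n}(\gamma + k)|^2 \ge \inf_{\gamma \in [-\frac12, \frac12]} |\widehat{B_n}(\gamma)|^2 = \Big( \frac{\sin(\pi/2)}{\pi/2} \Big)^{2n} = \Big( \frac{2}{\pi} \Big)^{2n}. \tag{2.33}$$

The result now follows from Theorem 2.24. □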
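The bounds obtained in the proof can be observed numerically. The following sketch evaluates a truncated version of the periodization (2.32) for $\phi = B_n$, using the closed form $\widehat{B_n}(\gamma) = (\sin(\pi\gamma)/(\pi\gamma))^n$ from Corollary 2.23. The truncation parameter `terms` and the function names are assumptions of this example; for $n = 1$ the series converges slowly, so the truncated sum lies slightly below $1$.

```python
import math

def bhat(n, gamma):
    """Fourier transform of the centered B-spline B_n at gamma."""
    if gamma == 0.0:
        return 1.0
    x = math.pi * gamma
    return (math.sin(x) / x) ** n

def phi(n, gamma, terms=2000):
    """Truncated periodization: sum of |bhat(n, gamma + k)|^2 over |k| <= terms."""
    return sum(bhat(n, gamma + k) ** 2 for k in range(-terms, terms + 1))

def lower_bound(n):
    """The constant (2/pi)^(2n) from (2.33)."""
    return (2.0 / math.pi) ** (2 * n)
```

For the linear B-spline ($n = 2$) the sum at $\gamma = 1/2$ equals $1/3$, comfortably above the lower bound $(2/\pi)^4 \approx 0.164$.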


2.9 Basic Gabor Frame Theory 

We are now ready to approach the analysis of Gabor systems. Consider for $b \in \mathbb{R}$
the modulation operator

$$E_b : L^2(\mathbb{R}) \to L^2(\mathbb{R}), \quad (E_b f)(x) = e^{2\pi i b x} f(x).$$

A collection of functions of the form $\{E_{mb} T_{na} g\}_{m,n \in \mathbb{Z}}$ is called a Gabor system.
Explicitly, these functions have the form

$$E_{mb} T_{na} g(x) = e^{2\pi i m b x} g(x - na).$$
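The explicit form of a Gabor atom translates directly into code. In the sketch below, the window is the centered linear B-spline $B_2(x) = \max(0, 1 - |x|)$, and the parameter values `a=1.0`, `b=0.5`, `m=3`, `n=-2` are arbitrary choices for illustration; the function names are hypothetical.

```python
import cmath
import math

def hat(x):
    """Centered B-spline B_2, used here as a sample window g."""
    return max(0.0, 1.0 - abs(x))

def gabor_atom(g, a, b, m, n):
    """Return the function x -> e^{2 pi i m b x} g(x - n a)."""
    def atom(x):
        return cmath.exp(2j * math.pi * m * b * x) * g(x - n * a)
    return atom

# One element of the Gabor system generated by the hat function.
atom = gabor_atom(hat, a=1.0, b=0.5, m=3, n=-2)
```

Modulation only changes the phase, so `abs(atom(x))` equals the translated window `hat(x + 2.0)` for every `x`.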

Systems of the Gabor type play a role in time-frequency analysis. In this chapter 
we will focus on the frame properties for Gabor systems, in particular, for the case 
where g is a B-spline. For a broader view on Gabor systems, we refer to the book 
by Gröchenig [108], as well as the collections of research papers in the books [86]
and [87] edited by Feichtinger and Strohmer.

Our purpose is to consider frames for $L^2(\mathbb{R})$ having the Gabor structure:

Definition 2.26. A Gabor frame is a frame for $L^2(\mathbb{R})$ of the form $\{E_{mb} T_{na} g\}_{m,n \in \mathbb{Z}}$,
where $a, b > 0$ are given and $g \in L^2(\mathbb{R})$ is a fixed function.

Frames of this type are also called Weyl-Heisenberg frames. The function g is 
called the window function or the generator. 

Gabor systems play a role in the context of time-frequency analysis. Although
bases of the Gabor type exist (take, e.g., $a = b = 1$ and $g = \chi_{[0,1]}$), they are not well
suited for the purpose of the time-frequency analysis of functions. One reason is that
the generator $g$ cannot be particularly nice; for example, it is known that no continuous
and compactly supported function $g$ can generate a Gabor basis, regardless of
the choice of the parameters $a$ and $b$. Another reason is that it is impossible to have
a Gabor basis generated by a function $g$ with fast decay in the time domain and the
frequency domain. This is the famous Balian-Low theorem:

Theorem 2.27. Let $g \in L^2(\mathbb{R})$. If $\{E_m T_n g\}_{m,n \in \mathbb{Z}}$ is a Riesz basis for $L^2(\mathbb{R})$, then

$$\Big( \int_{-\infty}^{\infty} |x \, g(x)|^2 \, dx \Big) \Big( \int_{-\infty}^{\infty} |\gamma \, \hat{g}(\gamma)|^2 \, d\gamma \Big) = \infty. \tag{2.34}$$
The Balian-Low theorem implies that if $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ is a Gabor-Riesz basis,
then it is not possible that $g$ and $\hat g$ satisfy decay conditions like

\[
|g(x)| \le \frac{C}{1+x^2},\ x\in\mathbb{R}, \qquad |\hat g(\gamma)| \le \frac{C}{1+\gamma^2},\ \gamma\in\mathbb{R},
\]

simultaneously. The reader can find a proof of the Balian-Low theorem in [59].

This discussion motivates the analysis and construction of Gabor frames: As we 
have seen, frames lead to expansions that are somewhat similar to the expansions 
obtained via bases, at least if we restrict our attention to tight frames or to frames 
for which convenient duals can be found. We will present such constructions in the 
next sections. 

We now state a proposition that gives a necessary condition for $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$
to be a frame for $L^2(\mathbb{R})$. It depends on the interplay between the function $g$ and the
translation parameter $a$ and is expressed in terms of the function

\[ G(x) := \sum_{n\in\mathbb{Z}} |g(x-na)|^2, \quad x\in\mathbb{R}. \tag{2.35} \]

The proof can be found, e.g., in [135] or [49].

Proposition 2.28. Let $g \in L^2(\mathbb{R})$ and $a,b > 0$ be given, and assume that the collection
of functions $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ is a frame with bounds $A, B$. Then

\[ bA \le \sum_{n\in\mathbb{Z}} |g(x-na)|^2 \le bB, \quad \text{a.e. } x \in \mathbb{R}. \tag{2.36} \]

More precisely: If the upper bound in (2.36) is violated, then $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ is not
a Bessel sequence; if the lower bound is violated, then $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ does not
satisfy the lower frame condition.

It follows from Proposition 2.28 that a function $g$ generating a Gabor frame
$\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ necessarily is bounded. Note also that Proposition 2.28 gives a
relationship between the frame bounds and the lower and upper bounds for the
function $G$ in (2.35). In Corollary 2.32 we will see that under certain circumstances,
the necessary condition (2.36) is also sufficient for $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ to be a frame
for $L^2(\mathbb{R})$.

If we want to check by hand that a Gabor system $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ forms a frame,
we need to be able to estimate the expression $\sum_{m,n\in\mathbb{Z}} |\langle f, E_{mb}T_{na}g\rangle|^2$ for all functions
$f$ belonging to $L^2(\mathbb{R})$ (or at least a dense subset thereof). Under certain conditions
on the functions $f$ and $g$, we can find an explicit expression for this infinite sum.
The next statement is taken from [44], but we notice that similar results already
appear in [59].

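Lemma 2.29. Suppose that $f$ is a bounded, measurable function with compact
support and that the function $G$ defined by (2.35) is bounded. Then

\[
\sum_{m,n\in\mathbb{Z}} |\langle f, E_{mb}T_{na}g\rangle|^2
= \frac{1}{b}\int_{-\infty}^{\infty} |f(x)|^2 \sum_{n\in\mathbb{Z}} |g(x-na)|^2\,dx
+ \frac{1}{b}\sum_{k\neq 0}\int_{-\infty}^{\infty} \overline{f(x)}\,f(x-k/b) \sum_{n\in\mathbb{Z}} g(x-na)\,\overline{g(x-na-k/b)}\,dx.
\]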
Lemma 2.29 has several important consequences. For example, as shown in [44],
it leads to a sufficient condition for $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ to form a frame (the original
proof can be found in [207]):

Theorem 2.30. Let $g \in L^2(\mathbb{R})$, $a,b > 0$, and suppose that

\[
B := \frac{1}{b}\sup_{x\in[0,a]} \sum_{k\in\mathbb{Z}} \Big| \sum_{n\in\mathbb{Z}} g(x-na)\,\overline{g(x-na-k/b)} \Big| < \infty. \tag{2.37}
\]

Then $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ is a Bessel sequence with bound $B$. If also

\[
A := \frac{1}{b}\inf_{x\in[0,a]} \Big[ \sum_{n\in\mathbb{Z}} |g(x-na)|^2 - \sum_{k\neq 0} \Big| \sum_{n\in\mathbb{Z}} g(x-na)\,\overline{g(x-na-k/b)} \Big| \Big] > 0, \tag{2.38}
\]

then $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ is a frame for $L^2(\mathbb{R})$ with bounds $A, B$.

Condition (2.37) leads to an easy, sufficient condition for $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ to be
a Bessel sequence (Exercise 18):

Corollary 2.31. Let $g \in L^2(\mathbb{R})$ be bounded and compactly supported. Then the
collection of functions $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ is a Bessel sequence for any choice of
$a,b > 0$.

The condition that the function $g$ is bounded and compactly supported is not
sufficient for $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ to be a frame: In fact, as shown in Proposition 2.28,
the associated function $G$ in (2.35) also needs to be bounded below and above. On
the other hand, for a function $g$ with compact support, the condition that the function
$G$ is bounded below and above for some $a > 0$ is enough for $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ to be
a frame for sufficiently small values of $b$. We also obtain expressions for the frame
operator and its inverse in this case:

Corollary 2.32. Let $a,b > 0$ be given. Suppose that $g \in L^2(\mathbb{R})$ has support in an
interval of length $1/b$ and that the function $G$ satisfies (2.36) for some $A,B > 0$.
Then $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ is a frame for $L^2(\mathbb{R})$ with bounds $A, B$. The frame operator $S$
and its inverse $S^{-1}$ are given by

\[ Sf = \frac{G}{b}\,f, \qquad S^{-1}f = \frac{b}{G}\,f, \qquad f \in L^2(\mathbb{R}). \]

Proof. That $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ is a frame follows directly from Lemma 2.29 or
Theorem 2.30 because

\[ \sum_{n\in\mathbb{Z}} g(x-na)\,\overline{g(x-na-k/b)} = 0 \quad \text{for all } k \neq 0. \]
Given a continuous function $f$ with compact support, Lemma 2.29 implies that

\[
\langle Sf, f\rangle = \sum_{m,n\in\mathbb{Z}} |\langle f, E_{mb}T_{na}g\rangle|^2 = \frac{1}{b}\int_{-\infty}^{\infty} |f(x)|^2\, G(x)\,dx = \Big\langle \frac{G}{b}\,f, f \Big\rangle.
\]

By the continuity of $S$, this expression even holds for all $f \in L^2(\mathbb{R})$. It follows that
$S$ acts by multiplication with the function $G/b$. □
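
As a concrete instance of Corollary 2.32, the following sketch evaluates $G$ numerically and reads off the frame bounds $\inf G/b$ and $\sup G/b$. It is an illustration added here, not part of the cited results: it assumes the generator is the B-spline $N_2$ (equal to $x$ on $[0,1)$ and $2-x$ on $[1,2)$), with $a = 1$ and $b = 1/2$, so that the support of $g$ has length $1/b$. The function names, the truncation of the sum over $n$, and the sampling grid are all hypothetical choices.

```python
import numpy as np

def n2(x):
    """B-spline N_2: x on [0, 1), 2 - x on [1, 2), zero elsewhere."""
    x = np.asarray(x, dtype=float)
    return np.where((x >= 0) & (x < 1), x,
                    np.where((x >= 1) & (x < 2), 2.0 - x, 0.0))

def G(x, g, a, n_range=range(-10, 11)):
    """Truncated version of G(x) = sum_n |g(x - n a)|^2 from (2.35)."""
    return sum(np.abs(g(x - n * a)) ** 2 for n in n_range)

a, b = 1.0, 0.5                 # supp N_2 = [0, 2] has length 1/b
x = np.linspace(0.0, a, 201)    # G is a-periodic, so one period suffices
Gx = G(x, n2, a)

# Grid estimates of the frame bounds A = inf G / b and B = sup G / b
A, B = Gx.min() / b, Gx.max() / b
print(A, B)
```

On $[0,1]$ one has $G(x) = x^2 + (1-x)^2$, which ranges over $[1/2, 1]$, so under these assumptions the grid estimates should be close to $A = 1$ and $B = 2$; the frame operator is then multiplication by $2G$.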

For a continuous function $g$, we can be even more explicit. We leave the proof of
the following result to the reader (Exercise 22).

Corollary 2.33. Suppose that $g \in L^2(\mathbb{R})$ is a continuous function with support on an
interval $I$ with length $|I|$ and that $g(x) > 0$ on the interior of $I$. Then $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$
is a frame for all $(a,b) \in (0,|I|) \times (0,1/|I|]$.

In particular, this result applies to the B-splines. In order to avoid a conflict with
our notation for a Gabor system, we will denote the splines by $B_\ell$ and $N_\ell$, $\ell \in \mathbb{N}$,
instead of $B_n$ and $N_n$.

Corollary 2.34. For $\ell \in \mathbb{N}$, the B-splines $B_\ell$ and $N_\ell$ generate Gabor frames for all

\[ (a,b) \in (0,\ell) \times (0,1/\ell]. \]

One might wonder whether the Gabor system $\{E_{mb}T_{na}B_\ell\}_{m,n\in\mathbb{Z}}$ is a frame for
$(a,b) \notin (0,\ell) \times (0,1/\ell]$. Surprisingly, only a few partial results are known. For the
B-spline $B_2$, it is proved in [117] that if $b \in \mathbb{N} \setminus \{1\}$, the Gabor system cannot form
a frame for any $a > 0$; see also Exercise 19.

The results discussed so far concentrate on the interplay between the function
$g$ and the parameters $a,b$. For completeness, we mention a central result in Gabor
analysis, although it does not have a direct influence on the results presented here:
It shows that, regardless of the choice of generator $g \in L^2(\mathbb{R})$, the choice of the
parameters $a$ and $b$ puts certain restrictions on the possible frame properties for
$\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$.

Theorem 2.35. Let $g \in L^2(\mathbb{R})$ and $a,b > 0$ be given. Then the following hold:

1. If $ab > 1$, then $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ cannot be a frame for $L^2(\mathbb{R})$.

2. If $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ is a frame, then

\[ ab = 1 \iff \{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}} \text{ is a Riesz basis.} \tag{2.39} \]


2.10 Tight Gabor Frames 


In applications of frames, it is inconvenient that the frame decomposition, stated 
in Theorem 2.17, requires inversion of the frame operator. As we have seen in the 
discussion of general frame theory, one way of avoiding the problem is to consider 
tight frames. We will now characterize tight Gabor frames; the first result is taken 
from [45]. 
Theorem 2.36. Let $g \in L^2(\mathbb{R})$ and $a,b > 0$ be given. Then the following are
equivalent:

1. $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ is a tight frame for $L^2(\mathbb{R})$ with frame bound $A = 1$.

2. For a.e. $x \in \mathbb{R}$, the following conditions hold:

a. $G(x) := \sum_{n\in\mathbb{Z}} |g(x-na)|^2 = b$;

b. $G_k(x) := \sum_{n\in\mathbb{Z}} g(x-na)\,\overline{g(x-na-k/b)} = 0$ for all $k \neq 0$.


Proof. 1 ⇒ 2: Assume that $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ is a tight frame for $L^2(\mathbb{R})$ with frame
bound $A = 1$. Then Proposition 2.28 shows that $G(x) = b$ for a.e. $x \in \mathbb{R}$. Therefore,

\[ \sum_{m,n\in\mathbb{Z}} |\langle f, E_{mb}T_{na}g\rangle|^2 = \frac{1}{b}\int_{-\infty}^{\infty} |f(x)|^2\, G(x)\,dx \]

for all functions $f \in L^2(\mathbb{R})$. Using Lemma 2.29, we conclude that for all bounded,
compactly supported $f \in L^2(\mathbb{R})$,

\[ \frac{1}{b}\sum_{k\neq 0}\int_{-\infty}^{\infty} \overline{f(x)}\,f(x-k/b) \sum_{n\in\mathbb{Z}} g(x-na)\,\overline{g(x-na-k/b)}\,dx = 0. \]

A change of variable shows that the contribution in the above sum arising from
any value of $k \in \mathbb{Z}$ is the complex conjugate of the contribution from the value $-k$.
Therefore,

\[
\sum_{k=1}^{\infty} \operatorname{Re}\Big( \int_{-\infty}^{\infty} \overline{f(x)}\,f(x-k/b) \sum_{n\in\mathbb{Z}} g(x-na)\,\overline{g(x-na-k/b)}\,dx \Big) = 0. \tag{2.40}
\]

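Now fix $k_0 \ge 1$ and let $I$ be any interval in $\mathbb{R}$ of length at most $1/b$. Define a function
$f \in L^2(\mathbb{R})$ by

\[
f(x) = \begin{cases}
e^{i \arg(G_{k_0}(x))} & \text{for } x \in I, \\
1 & \text{for } x \in I - k_0/b, \\
0 & \text{otherwise.}
\end{cases}
\]

Since $I$ has length at most $1/b$, the product $\overline{f(x)}\,f(x-k/b)$ with $k \ge 1$ is nonzero
(up to a set of measure zero) only for $k = k_0$ and $x \in I$. Then, by (2.40),

\[
0 = \sum_{k=1}^{\infty} \operatorname{Re}\Big( \int_{-\infty}^{\infty} \overline{f(x)}\,f(x-k/b) \sum_{n\in\mathbb{Z}} g(x-na)\,\overline{g(x-na-k/b)}\,dx \Big)
= \operatorname{Re}\Big( \int_{I} \overline{f(x)}\,f(x-k_0/b)\,G_{k_0}(x)\,dx \Big) = \int_{I} |G_{k_0}(x)|\,dx.
\]

It follows that $G_{k_0}(x) = 0$ for a.e. $x \in I$. Since $I$ was an arbitrary interval of length
at most $1/b$, we conclude that $G_{k_0} = 0$. In order to deal with $G_k$ for $k < 0$, a direct
computation shows that

\[ G_{-k_0}(x) = \overline{G_{k_0}(x + k_0/b)} = 0; \]

this shows that statement 2b indeed holds for all $k \neq 0$.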
2 ⇒ 1: The assumptions in 2 imply, again by Lemma 2.29, that for all bounded,
compactly supported functions $f \in L^2(\mathbb{R})$,

\[ \sum_{m,n\in\mathbb{Z}} |\langle f, E_{mb}T_{na}g\rangle|^2 = \frac{1}{b}\int_{-\infty}^{\infty} |f(x)|^2 \sum_{n\in\mathbb{Z}} |g(x-na)|^2\,dx = \|f\|^2. \]

Since the bounded, compactly supported functions are dense in $L^2(\mathbb{R})$, Lemma
2.14 implies that $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ is a tight frame with frame bound $A = 1$, as
desired. □

In general, it is not easy to construct functions $g$ such that the conditions in
Theorem 2.36 (2) are satisfied for some given $a,b > 0$. A simplification occurs if
we assume that $g$ has compact support: In that case, condition 2b is automatically
satisfied for sufficiently small values of the parameter $b$. In particular, we obtain the
following very useful sufficient condition for $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ being a tight Gabor
frame. We ask the reader to provide the proof in Exercise 23.

Corollary 2.37. Let $a,b > 0$ be given. Assume that $\varphi \in L^2(\mathbb{R})$ is a real-valued,
nonnegative function with support in an interval of length $1/b$, and that

\[ \sum_{n\in\mathbb{Z}} \varphi(x + na) = 1, \quad \text{a.e. } x \in \mathbb{R}. \tag{2.41} \]

Then the function

\[ g(x) := \sqrt{b\,\varphi(x)} \]

generates a tight Gabor frame $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ with frame bound $A = 1$.

If (2.41) is satisfied, we say that the functions $\{T_{na}\varphi\}_{n\in\mathbb{Z}}$ form a partition of unity.
In particular, we can apply the result to B-splines:

Example 2.38. For any $\ell \in \mathbb{N}$, the B-spline $\varphi = N_\ell$ defined in (2.26) satisfies the
requirements in Corollary 2.37 with $a = 1$ and any $b \in (0,1/\ell]$. Thus, for any
$b \in (0,1/\ell]$, the function

\[ g(x) = \sqrt{b\,N_\ell(x)} \]

generates a tight Gabor frame $\{E_{mb}T_n g\}_{m,n\in\mathbb{Z}}$ with frame bound $A = 1$.

We note that the frame generators in Example 2.38 are very suitable for
time-frequency analysis: They are given by an explicit formula, have compact support,
and can be chosen with polynomial decay of any desired order in the frequency
domain, simply by taking the parameter $\ell$ sufficiently large.
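
The two conditions of Theorem 2.36 can be checked numerically for the window of Example 2.38. The sketch below is an added illustration under stated assumptions: it takes $\ell = 2$, $a = 1$, $b = 1/2$, uses the piecewise formula for $N_2$, truncates the sums over $n$, and samples $x$ on a finite grid. The helper names are hypothetical.

```python
import numpy as np

def n2(x):
    """B-spline N_2: x on [0, 1), 2 - x on [1, 2), zero elsewhere."""
    x = np.asarray(x, dtype=float)
    return np.where((x >= 0) & (x < 1), x,
                    np.where((x >= 1) & (x < 2), 2.0 - x, 0.0))

def tight_window(x, b):
    """Window g = sqrt(b * N_2), i.e. Example 2.38 with l = 2."""
    return np.sqrt(b * n2(x))

def G_k(x, b, k, a=1.0, n_range=range(-10, 11)):
    """Truncated sum over n of g(x - n a) * g(x - n a - k/b); g is real here."""
    return sum(tight_window(x - n * a, b) * tight_window(x - n * a - k / b, b)
               for n in n_range)

b = 0.5                          # any b in (0, 1/2] is admissible for l = 2
x = np.linspace(0.0, 1.0, 201)
G0 = G_k(x, b, 0)                # condition 2a: expected to equal b
G1 = G_k(x, b, 1)                # condition 2b for k = 1: expected to vanish
G2 = G_k(x, b, 2)                # condition 2b for k = 2: expected to vanish
print(np.max(np.abs(G0 - b)), np.max(np.abs(G1)), np.max(np.abs(G2)))
```

Condition 2a reduces to the partition of unity $N_2(x) + N_2(x+1) = 1$ on $[0,1]$, and condition 2b holds because the supports of $g$ and $g(\cdot - k/b)$ overlap in at most one point when $1/b \ge 2$.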


2.11 The Duals of a Gabor Frame 

For a Gabor frame $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ with associated frame operator $S$, the frame
decomposition (see Theorem 2.17) shows that

\[ f = \sum_{m,n\in\mathbb{Z}} \langle f, S^{-1}E_{mb}T_{na}g\rangle\, E_{mb}T_{na}g, \quad \forall f \in L^2(\mathbb{R}). \tag{2.42} \]

In order to use the frame decomposition, we need to be able to calculate the
canonical dual frame $\{S^{-1}E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$. This is usually difficult. Via the following
lemma, we will be able to obtain a simplification; for the proof, we refer to [49].

Lemma 2.39. Let $g \in L^2(\mathbb{R})$ and $a,b > 0$ be given, and assume that $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$
is a Bessel sequence with frame operator $S$. Then the following hold:

1. $SE_{mb}T_{na} = E_{mb}T_{na}S$ for all $m,n \in \mathbb{Z}$.

2. If $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ is a frame, then

\[ S^{-1}E_{mb}T_{na} = E_{mb}T_{na}S^{-1}, \quad \forall m,n \in \mathbb{Z}. \]

Lemma 2.39 has important consequences for the structure of the canonical dual
frame of a Gabor frame:

Theorem 2.40. Let $g \in L^2(\mathbb{R})$ and $a,b > 0$ be given, and assume that the collection
of functions $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ is a Gabor frame. Then the canonical dual frame also
has Gabor structure and is given by $\{E_{mb}T_{na}S^{-1}g\}_{m,n\in\mathbb{Z}}$.

Via Theorem 2.40, the frame decomposition (2.42) associated with a Gabor
frame $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ takes the form

\[ f = \sum_{m,n\in\mathbb{Z}} \langle f, E_{mb}T_{na}S^{-1}g\rangle\, E_{mb}T_{na}g, \quad \forall f \in L^2(\mathbb{R}). \tag{2.43} \]

In practice, this version of the frame decomposition is much more convenient than
(2.42): Instead of calculating the double infinite family $\{S^{-1}E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$, it is
enough to find $S^{-1}g$ and then apply the modulation and translation operators. The
function $S^{-1}g$ is called the dual window function or the dual generator.

We will now leave the discussion of the canonical dual frame and examine the
question of how general dual frames of a given Gabor frame $\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ can
be found. Our analysis is based on a fundamental result due to Ron and Shen [207]
and to Janssen [146]; the technical proof can be found in the original papers
or in [49].

Theorem 2.41. Let $g,h \in L^2(\mathbb{R})$ and $a,b > 0$ be given. Two Bessel sequences
$\{E_{mb}T_{na}g\}_{m,n\in\mathbb{Z}}$ and $\{E_{mb}T_{na}h\}_{m,n\in\mathbb{Z}}$ form dual frames if and only if

\[ \sum_{k\in\mathbb{Z}} \overline{g(x - n/b - ka)}\,h(x - ka) = b\,\delta_{n,0}, \quad \text{a.e. } x \in [0,a]. \tag{2.44} \]


2.12 Explicit Construction of Dual Gabor Frame Pairs 

So far, we have only seen a few examples of Gabor frames and their dual frames.
After the preparation in Section 2.11, we are now ready to provide explicit constructions
of certain Gabor frames and some particularly convenient duals. The assumptions
are tailored to the properties of the B-splines. The results presented here first
appeared in [48]. For convenience, we first consider the case where the translation
parameter is $a = 1$:

Theorem 2.42. Let $N \in \mathbb{N}$. Let $g \in L^2(\mathbb{R})$ be a real-valued, bounded function with
$\operatorname{supp} g \subseteq [0,N]$, for which

\[ \sum_{k\in\mathbb{Z}} g(x-k) = 1, \quad x \in \mathbb{R}. \tag{2.45} \]

Let $b \in (0, 1/(2N-1)]$. Then the function $g$ and the function $h$ defined by

\[ h(x) = b\,g(x) + 2b \sum_{k=1}^{N-1} g(x+k) \tag{2.46} \]

generate dual frames $\{E_{mb}T_n g\}_{m,n\in\mathbb{Z}}$ and $\{E_{mb}T_n h\}_{m,n\in\mathbb{Z}}$ for $L^2(\mathbb{R})$.

Proof. By assumption, the function $g$ has compact support and is bounded; by
definition (2.46), the function $h$ shares these properties. It now follows from
Corollary 2.31 that $\{E_{mb}T_n g\}_{m,n\in\mathbb{Z}}$ and $\{E_{mb}T_n h\}_{m,n\in\mathbb{Z}}$ are Bessel sequences. In
order to verify that these sequences form dual frames, we use Theorem 2.41:
According to (2.44), we need to check that for $x \in [0,1]$,

\[ \sum_{k\in\mathbb{Z}} g(x - n/b - k)\,h(x-k) = b\,\delta_{n,0}. \tag{2.47} \]

The function $g$ has support in $[0,N]$, so by construction $h$ has support in
$[-N+1,N]$; thus, (2.47) is satisfied for $n \neq 0$ whenever $1/b \ge 2N-1$, i.e., if
$b \in (0, 1/(2N-1)]$. For $n = 0$, condition (2.47) means that

\[ \sum_{k\in\mathbb{Z}} g(x-k)\,h(x-k) = b, \quad x \in [0,1]; \]

because of the compact support of $g$, this is equivalent to

\[ \sum_{k=0}^{N-1} g(x+k)\,h(x+k) = b, \quad x \in [0,1]. \tag{2.48} \]

Condition (2.48) is indeed satisfied in our setting. To see this, we use that for
$x \in [0,1]$,

\[ 1 = \sum_{k=0}^{N-1} g(x+k). \]

This implies that, again for $x \in [0,1]$,

\[
\begin{aligned}
1 &= \Big( \sum_{k=0}^{N-1} g(x+k) \Big)^2 \\
&= \big(g(x) + g(x+1) + \cdots + g(x+N-1)\big) \times \big(g(x) + g(x+1) + \cdots + g(x+N-1)\big) \\
&= g(x)\,[g(x) + 2g(x+1) + 2g(x+2) + \cdots + 2g(x+N-1)] \\
&\quad + g(x+1)\,[g(x+1) + 2g(x+2) + 2g(x+3) + \cdots + 2g(x+N-1)] \\
&\quad + g(x+2)\,[g(x+2) + 2g(x+3) + 2g(x+4) + \cdots + 2g(x+N-1)] \\
&\quad + \cdots \\
&\quad + g(x+N-2)\,[g(x+N-2) + 2g(x+N-1)] \\
&\quad + g(x+N-1)\,[g(x+N-1)] \\
&= \frac{1}{b} \sum_{k=0}^{N-1} g(x+k)\,h(x+k).
\end{aligned}
\]

Thus, condition (2.48) is satisfied. □

The assumptions in Theorem 2.42 are tailored to the properties of the B-splines
$N_\ell$ defined in (2.26):

Corollary 2.43. For any $\ell \in \mathbb{N}$ and $b \in (0, 1/(2\ell-1)]$, the functions $N_\ell$ and

\[ h_\ell(x) := b\,N_\ell(x) + 2b \sum_{k=1}^{\ell-1} N_\ell(x+k) \tag{2.49} \]

generate dual frames $\{E_{mb}T_n N_\ell\}_{m,n\in\mathbb{Z}}$ and $\{E_{mb}T_n h_\ell\}_{m,n\in\mathbb{Z}}$ for $L^2(\mathbb{R})$.

Some of the important features of the dual pair of frame generators $(N_\ell, h_\ell)$ in
Corollary 2.43 are as follows:

1. The functions $N_\ell$ and $h_\ell$ are splines for all choices of $\ell \in \mathbb{N}$;

2. $N_\ell$ and $h_\ell$ are explicitly given functions with compact support, i.e., they have
perfect time-localization;

3. By choosing $\ell \in \mathbb{N}$ sufficiently large, polynomial decay of $\widehat{N_\ell}$ and $\widehat{h_\ell}$ of any
desired order can be obtained.

Example 2.44. For the B-spline

\[
N_2(x) = \begin{cases}
x, & x \in [0,1), \\
2 - x, & x \in [1,2), \\
0, & x \notin [0,2),
\end{cases}
\]

we can use Corollary 2.43 for $b \in (0,1/3]$. For $b = 1/3$, we obtain the dual generator

\[
h_2(x) = \tfrac{1}{3} N_2(x) + \tfrac{2}{3} N_2(x+1) = \begin{cases}
\tfrac{2}{3}(x+1), & x \in [-1,0), \\
\tfrac{1}{3}(2-x), & x \in [0,2), \\
0, & x \notin [-1,2).
\end{cases} \tag{2.50}
\]

See Fig. 2.2, which also shows a similar construction based on $N_3$.

Fig. 2.2: (a) The B-spline $N_2$ and the dual generator $h_2$ in (2.50); (b) the B-spline $N_3$ and the dual
generator $h_3$ in (2.46) with $b = 1/5$.
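
The duality condition (2.47) and the closed form (2.50) can be confirmed numerically. The following sketch is an added illustration, assuming $g = N_2$, $a = 1$, and $b = 1/3$; the function names, the truncation range for $k$, and the sampling grids are hypothetical choices.

```python
import numpy as np

def n2(x):
    """B-spline N_2: x on [0, 1), 2 - x on [1, 2), zero elsewhere."""
    x = np.asarray(x, dtype=float)
    return np.where((x >= 0) & (x < 1), x,
                    np.where((x >= 1) & (x < 2), 2.0 - x, 0.0))

def h2(x, b):
    """Dual window from (2.49) with l = 2: h = b N_2(x) + 2 b N_2(x + 1)."""
    return b * n2(x) + 2.0 * b * n2(x + 1.0)

def h2_closed_form(x):
    """Piecewise expression (2.50), valid for b = 1/3."""
    x = np.asarray(x, dtype=float)
    return np.where((x >= -1) & (x < 0), 2.0 * (x + 1.0) / 3.0,
                    np.where((x >= 0) & (x < 2), (2.0 - x) / 3.0, 0.0))

def duality_sum(x, b, n, k_range=range(-12, 13)):
    """Truncated left-hand side of (2.47): sum_k g(x - n/b - k) h(x - k)."""
    return sum(n2(x - n / b - k) * h2(x - k, b) for k in k_range)

b = 1.0 / 3.0
x = np.linspace(0.0, 1.0, 101)      # (2.47) is required for x in [0, 1]
grid = np.linspace(-2.0, 3.0, 501)  # grid covering supp h_2 = [-1, 2]

print(np.max(np.abs(duality_sum(x, b, 0) - b)))   # n = 0: sum should equal b
print(np.max(np.abs(duality_sum(x, b, 1))))       # n != 0: sum should vanish
print(np.max(np.abs(h2(grid, b) - h2_closed_form(grid))))
```

For $n = 0$ the sum on $[0,1]$ is $x(2-x)/3 + (1-x)^2/3 = 1/3$, which is the identity (2.48) in this special case.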


Via a scaling, one can obtain a version of Theorem 2.42 that is valid for any 
translation parameter a > 0; see [48] for details. 

The reader will notice that even though the B-splines $N_\ell$ are symmetric, the
constructed dual generators are not. We state without proof a recent result, due to
Christensen and Kim [50], that gives freedom to choose various duals:

Theorem 2.45. Let $N \in \mathbb{N}$. Let $g \in L^2(\mathbb{R})$ be a real-valued, bounded function with
$\operatorname{supp} g \subseteq [0,N]$, for which

\[ \sum_{n\in\mathbb{Z}} g(x-n) = 1. \]

Let $b \in (0, 1/(2N-1)]$. Consider any scalar sequence $\{a_n\}_{n=-N+1}^{N-1}$ for which

\[ a_0 = b \quad \text{and} \quad a_n + a_{-n} = 2b, \quad n = 1,2,\ldots,N-1, \tag{2.51} \]

and define $h \in L^2(\mathbb{R})$ by

\[ h(x) = \sum_{n=-N+1}^{N-1} a_n\, g(x+n). \tag{2.52} \]

Then $g$ and $h$ generate dual frames $\{E_{mb}T_n g\}_{m,n\in\mathbb{Z}}$ and $\{E_{mb}T_n h\}_{m,n\in\mathbb{Z}}$ for $L^2(\mathbb{R})$.

In particular, if the generator $g$ is symmetric, it is possible to construct a
symmetric dual generator:

Corollary 2.46. Under the assumptions in Theorem 2.45, the function

\[ h(x) = b \sum_{n=-N+1}^{N-1} g(x+n) \tag{2.53} \]

generates a dual frame of $\{E_{mb}T_n g\}_{m,n\in\mathbb{Z}}$. The function $h$ satisfies that $h = b$ on the
support of $g$. Furthermore, if $g$ is symmetric, then $h$ is symmetric.

See Fig. 2.3 for an illustration of the dual generators, based on the B-splines
$N_2$ and $N_3$.

Fig. 2.3: (a) The B-spline $N_2$ and the dual generator in (2.53) for $b = 1/3$. (b) The B-spline $N_3$
and the dual generator in (2.53) for $b = 1/5$.
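
The freedom in (2.51) can be explored numerically. The sketch below is an added illustration under stated assumptions: it takes $g = N_2$ (so $N = 2$ and the sequence has entries $a_{-1}, a_0, a_1$), $b = 1/3$, and checks the $n = 0$ case of (2.47) for three admissible sequences. The dictionary-based representation of $\{a_n\}$ and all function names are hypothetical.

```python
import numpy as np

def n2(x):
    """B-spline N_2: x on [0, 1), 2 - x on [1, 2), zero elsewhere."""
    x = np.asarray(x, dtype=float)
    return np.where((x >= 0) & (x < 1), x,
                    np.where((x >= 1) & (x < 2), 2.0 - x, 0.0))

def dual_window(x, coeffs):
    """h(x) = sum_n a_n g(x + n) as in (2.52), with g = N_2, coeffs = {n: a_n}."""
    return sum(a_n * n2(x + n) for n, a_n in coeffs.items())

def duality_sum(x, coeffs, k_range=range(-6, 7)):
    """Truncated sum over k of g(x - k) h(x - k), the n = 0 case of (2.47)."""
    return sum(n2(x - k) * dual_window(x - k, coeffs) for k in k_range)

b = 1.0 / 3.0
x = np.linspace(0.0, 1.0, 101)
choices = {
    "one-sided, as in (2.46)": {-1: 0.0, 0: b, 1: 2.0 * b},
    "symmetric, as in (2.53)": {-1: b, 0: b, 1: b},
    "another admissible choice": {-1: 1.5 * b, 0: b, 1: 0.5 * b},
}
# Each sequence satisfies a_0 = b and a_1 + a_{-1} = 2b
results = {name: duality_sum(x, c) for name, c in choices.items()}
for name, values in results.items():
    print(name, np.max(np.abs(values - b)))

# The symmetric dual equals b on supp g = [0, 2]
support = np.linspace(0.0, 2.0, 201)
h_sym = dual_window(support, choices["symmetric, as in (2.53)"])
```

For $x \in [0,1]$ the sum equals $b x^2 + (a_1 + a_{-1})\,x(1-x) + b(1-x)^2$, which collapses to $b$ exactly when $a_1 + a_{-1} = 2b$; this is the role of condition (2.51) in this small case.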


2.13 Wavelets and the Unitary Extension Principle 

Classical wavelet theory deals with the construction of orthonormal bases for $L^2(\mathbb{R})$
of the form $\{2^{j/2}\psi(2^j x - k)\}_{j,k\in\mathbb{Z}}$ for a suitable function $\psi \in L^2(\mathbb{R})$. Introducing
the dilation operator

\[ D : L^2(\mathbb{R}) \to L^2(\mathbb{R}), \qquad (Df)(x) = 2^{1/2} f(2x), \]

the wavelet system takes the form

\[ \{2^{j/2}\psi(2^j x - k)\}_{j,k\in\mathbb{Z}} = \{D^j T_k \psi\}_{j,k\in\mathbb{Z}}. \]


Most wavelet constructions are based on the so-called multiresolution analysis, 
invented by Mallat in 1989; soon after that, Daubechies [59] presented her famous 
construction of compactly supported wavelets. We will not go into a discussion of 
all the aspects of classical multiresolution analysis, but refer to Daubechies’ book 
[60]. Our focus will be on a more recent construction of tight wavelet frames, based 
on B-splines. 

In the context of B-splines, Battle and Lemarié have proven that for any (centered)
B-spline $B_m$, it is possible to construct an orthonormal basis $\{D^j T_k \psi\}_{j,k\in\mathbb{Z}}$ for $L^2(\mathbb{R})$
for a function $\psi$ of the form

\[ \psi(x) = \sum_{k\in\mathbb{Z}} c_k B_m(2x-k). \tag{2.54} \]

However, except for the case of the first B-spline $B_1$, the sequence $\{c_k\}_{k\in\mathbb{Z}}$ is
infinite; and one can prove that no orthonormal basis construction for a function
$\psi$ of the form (2.54) with a finite sequence $\{c_k\}_{k\in\mathbb{Z}}$ is possible. This
motivates the result presented in the current section: In fact, we will construct
tight frames of wavelet structure, but based on two (or more) generators of the
form (2.54).

Our aim is to state the unitary extension principle of Ron and Shen [206], which
enables us to construct tight frames for $L^2(\mathbb{R})$ of the form $\{D^j T_k \psi_\ell\}_{j,k\in\mathbb{Z},\,\ell=1,\ldots,n}$;
after doing so, we will show how to construct frames based on B-splines. We follow
the approach by Benedetto and Treiber [20].

The following proofs are based on standard Fourier analysis for 1-periodic functions.
It will be convenient to write the integrals appearing, e.g., in the expression
for the Fourier coefficients and in Parseval's equation, as integrals over the interval
$(-1/2,1/2)$ rather than $(0,1)$. The interval $(-1/2,1/2)$ is identified with the
torus $\mathbb{T}$, and the class of 1-periodic functions on $\mathbb{R}$ whose restriction to $(-1/2,1/2)$
belongs to $L^p(-1/2,1/2)$, $p = 1,2$, is denoted by $L^p(\mathbb{T})$. Similarly, $L^\infty(\mathbb{T})$ consists
of the bounded, measurable 1-periodic functions on $\mathbb{R}$. With this notation,
$L^\infty(\mathbb{T}) \subset L^2(\mathbb{T})$. We note that the spaces $L^p(\mathbb{T})$ actually consist of equivalence
classes of functions that are identical almost everywhere, so when we speak about
pointwise relationships between functions, it is understood that they can only be
expected to hold almost everywhere.

We now list the standing assumptions and conventions for this section. 

General setup: Let $\psi_0 \in L^2(\mathbb{R})$ and assume that

1. There exists a function $H_0 \in L^\infty(\mathbb{T})$ such that

\[ \widehat{\psi_0}(2\gamma) = H_0(\gamma)\,\widehat{\psi_0}(\gamma). \tag{2.55} \]

2. $\lim_{\gamma\to 0} \widehat{\psi_0}(\gamma) = 1$.

Further, let $H_1,\ldots,H_n \in L^\infty(\mathbb{T})$, and define $\psi_1,\ldots,\psi_n \in L^2(\mathbb{R})$ by

\[ \widehat{\psi_\ell}(2\gamma) = H_\ell(\gamma)\,\widehat{\psi_0}(\gamma), \quad \ell = 1,\ldots,n. \tag{2.56} \]

Finally, let $H$ denote the $(n+1) \times 2$ matrix-valued function defined by

\[
H(\gamma) = \begin{pmatrix}
H_0(\gamma) & T_{1/2}H_0(\gamma) \\
H_1(\gamma) & T_{1/2}H_1(\gamma) \\
\vdots & \vdots \\
H_n(\gamma) & T_{1/2}H_n(\gamma)
\end{pmatrix}, \quad \gamma \in \mathbb{R}. \tag{2.57}
\]

With this setup, our purpose is to find conditions on the functions $H_1,\ldots,H_n$ such
that $\psi_1,\ldots,\psi_n$ defined by (2.56) generate a multiwavelet frame for $L^2(\mathbb{R})$. It turns
out to be convenient to formulate the results in terms of the matrices $H(\gamma)$, $\gamma \in \mathbb{R}$.
Note that if we know the functions $H_\ell$, then we can find an explicit expression for
the functions $\psi_\ell$. In fact, expanding $H_\ell$ in a Fourier series, $H_\ell(\gamma) = \sum_{k\in\mathbb{Z}} c_{k,\ell}\, e^{2\pi i k\gamma}$,
elementary manipulations with the Fourier transform show that

\[ \psi_\ell(x) = \sqrt{2} \sum_{k\in\mathbb{Z}} c_{k,\ell}\, DT_{-k}\psi_0(x) = 2 \sum_{k\in\mathbb{Z}} c_{k,\ell}\, \psi_0(2x+k). \tag{2.58} \]

Recall that we prefer the functions $H_\ell$ to be trigonometric polynomials: This implies
that the sums in (2.58) are finite and therefore that the functions $\psi_\ell$ have compact
support if $\psi_0$ has compact support.

We are now ready to formulate the unitary extension principle; the quite compli¬ 
cated proof can be found in the original paper [206] by Ron and Shen, in the paper 
[20] by Benedetto and Treiber, or in [49]. 

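Theorem 2.47. Let $\{\psi_\ell, H_\ell\}_{\ell=0}^{n}$ be as in the general setup above, and assume that
$H(\gamma)^* H(\gamma) = I$ for a.e. $\gamma \in \mathbb{T}$. Then the multiwavelet system $\{D^j T_k \psi_\ell\}_{j,k\in\mathbb{Z},\,\ell=1,\ldots,n}$
constitutes a tight frame for $L^2(\mathbb{R})$ with frame bound equal to 1, and

\[ f = \sum_{\ell=1}^{n} \sum_{j\in\mathbb{Z}} \sum_{k\in\mathbb{Z}} \langle f, D^j T_k \psi_\ell\rangle\, D^j T_k \psi_\ell, \quad \forall f \in L^2(\mathbb{R}). \tag{2.59} \]

The matrix $H(\gamma)^* H(\gamma)$ has four entries, so at first glance it seems that we have
to solve four scalar equations in order to apply Theorem 2.47. However, it turns out
that it is enough to verify two sets of equations (Exercise 24):

Corollary 2.48. Let $\{\psi_\ell, H_\ell\}_{\ell=0}^{n}$ be as in the general setup given above, and assume
that

\[
\begin{cases}
\displaystyle \sum_{\ell=0}^{n} |H_\ell(\gamma)|^2 = 1, \\[2ex]
\displaystyle \sum_{\ell=0}^{n} \overline{H_\ell(\gamma)}\, T_{1/2}H_\ell(\gamma) = 0,
\end{cases} \tag{2.60}
\]

for a.e. $\gamma \in \mathbb{T}$. Then the multiwavelet system $\{D^j T_k \psi_\ell\}_{j,k\in\mathbb{Z},\,\ell=1,\ldots,n}$ constitutes a
tight frame for $L^2(\mathbb{R})$ with frame bound equal to 1.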

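As an application of Corollary 2.48, we show how one can construct compactly
supported, tight multiwavelet frames based on B-splines. In contrast with
the Battle-Lemarié wavelets, the generators will be finite linear combinations of the
type (2.54), and thus have compact support.

Example 2.49. For any $m = 1,2,\ldots$, we consider the B-spline

\[ \psi_0 := B_{2m} \]

of order $2m$ as defined in (2.31). By Corollary 2.23,

\[ \widehat{\psi_0}(\gamma) = \left(\frac{\sin(\pi\gamma)}{\pi\gamma}\right)^{2m}. \]

It is clear that $\lim_{\gamma\to 0} \widehat{\psi_0}(\gamma) = 1$. Furthermore, the result in Exercise 17 shows that

\[ \widehat{\psi_0}(2\gamma) = \cos^{2m}(\pi\gamma)\,\widehat{\psi_0}(\gamma). \]

Thus, $\psi_0$ satisfies a refinement equation with the two-scale symbol

\[ H_0(\gamma) = \cos^{2m}(\pi\gamma). \tag{2.61} \]

Note that Exercise 17 also explains why we are restricting ourselves to the case of
even-order B-splines. Now, consider the binomial coefficient

\[ \binom{2m}{\ell} = \frac{(2m)!}{(2m-\ell)!\,\ell!}, \]

and define the functions $H_1,\ldots,H_{2m} \in L^\infty(\mathbb{T})$ by

\[ H_\ell(\gamma) = \sqrt{\binom{2m}{\ell}}\, \sin^{\ell}(\pi\gamma) \cos^{2m-\ell}(\pi\gamma), \quad \ell = 1,\ldots,2m. \tag{2.62} \]

Using that $\cos(\pi(\gamma - 1/2)) = \sin(\pi\gamma)$ and $\sin(\pi(\gamma - 1/2)) = -\cos(\pi\gamma)$, it follows
that

\[ T_{1/2}H_\ell(\gamma) = \sqrt{\binom{2m}{\ell}}\, (-1)^{\ell} \cos^{\ell}(\pi\gamma) \sin^{2m-\ell}(\pi\gamma), \quad \ell = 1,\ldots,2m. \tag{2.63} \]

Thus, the matrix $H$ in (2.57) is given by

\[
H(\gamma) = \begin{pmatrix}
H_0(\gamma) & T_{1/2}H_0(\gamma) \\
H_1(\gamma) & T_{1/2}H_1(\gamma) \\
\vdots & \vdots \\
H_{2m}(\gamma) & T_{1/2}H_{2m}(\gamma)
\end{pmatrix}
= \begin{pmatrix}
\cos^{2m}(\pi\gamma) & \sin^{2m}(\pi\gamma) \\
\sqrt{\binom{2m}{1}} \sin(\pi\gamma)\cos^{2m-1}(\pi\gamma) & -\sqrt{\binom{2m}{1}} \cos(\pi\gamma)\sin^{2m-1}(\pi\gamma) \\
\sqrt{\binom{2m}{2}} \sin^{2}(\pi\gamma)\cos^{2m-2}(\pi\gamma) & \sqrt{\binom{2m}{2}} \cos^{2}(\pi\gamma)\sin^{2m-2}(\pi\gamma) \\
\vdots & \vdots \\
\sqrt{\binom{2m}{2m}} \sin^{2m}(\pi\gamma) & \sqrt{\binom{2m}{2m}} \cos^{2m}(\pi\gamma)
\end{pmatrix}.
\]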
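We now verify the conditions in Corollary 2.48. Using the binomial formula

\[ (x+y)^{2m} = \sum_{\ell=0}^{2m} \binom{2m}{\ell} x^{\ell} y^{2m-\ell}, \tag{2.64} \]

we see via (2.62) that

\[
\sum_{\ell=0}^{2m} |H_\ell(\gamma)|^2 = \sum_{\ell=0}^{2m} \binom{2m}{\ell} \sin^{2\ell}(\pi\gamma) \cos^{2(2m-\ell)}(\pi\gamma)
= \big(\sin^2(\pi\gamma) + \cos^2(\pi\gamma)\big)^{2m} = 1, \quad \gamma \in \mathbb{T}.
\]

Using the binomial formula with $x = -1$, $y = 1$, the expressions in (2.62) and (2.63)
yield

\[
\sum_{\ell=0}^{2m} \overline{H_\ell(\gamma)}\, T_{1/2}H_\ell(\gamma) = \sin^{2m}(\pi\gamma)\cos^{2m}(\pi\gamma) \sum_{\ell=0}^{2m} (-1)^{\ell} \binom{2m}{\ell}
= \sin^{2m}(\pi\gamma)\cos^{2m}(\pi\gamma)\,(1-1)^{2m} = 0.
\]

Now Corollary 2.48 implies that the $2m$ functions $\psi_1,\ldots,\psi_{2m}$ defined by

\[
\widehat{\psi_\ell}(\gamma) = H_\ell(\gamma/2)\,\widehat{\psi_0}(\gamma/2)
= \sqrt{\binom{2m}{\ell}}\, \frac{\sin^{2m+\ell}(\pi\gamma/2)\cos^{2m-\ell}(\pi\gamma/2)}{(\pi\gamma/2)^{2m}}
\]

generate a tight multiwavelet frame $\{D^j T_k \psi_\ell\}_{j,k\in\mathbb{Z},\,\ell=1,\ldots,2m}$ for $L^2(\mathbb{R})$.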
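
The two identities in (2.60) can also be confirmed numerically for the symbols (2.61) and (2.62). The sketch below is an added sanity check, not part of the proof; it interprets $T_{1/2}H_\ell(\gamma)$ as $H_\ell(\gamma - 1/2)$, in line with the trigonometric identities used for (2.63). The function names and the sampling grid are hypothetical.

```python
import numpy as np
from math import comb

def symbols(gamma, m):
    """Rows H_0, ..., H_{2m} of (2.61)-(2.62), evaluated on the array gamma."""
    gamma = np.asarray(gamma, dtype=float)
    s, c = np.sin(np.pi * gamma), np.cos(np.pi * gamma)
    return np.array([np.sqrt(comb(2 * m, l)) * s**l * c**(2 * m - l)
                     for l in range(2 * m + 1)])

def uep_residuals(gamma, m):
    """Deviation of the two sums in (2.60) from their target values 1 and 0."""
    H = symbols(gamma, m)
    H_shift = symbols(np.asarray(gamma) - 0.5, m)   # T_{1/2} H_l
    diag = np.sum(np.abs(H) ** 2, axis=0) - 1.0
    off = np.sum(np.conj(H) * H_shift, axis=0)
    return diag, off

gamma = np.linspace(-0.5, 0.5, 257)
for m in (1, 2, 3):
    diag, off = uep_residuals(gamma, m)
    print(m, np.max(np.abs(diag)), np.max(np.abs(off)))
```

Both residuals should be at the level of floating-point rounding for every $m$, reflecting the two applications of the binomial formula (2.64).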

We want to study the properties of the frame constructed in Example 2.49, but we
first change the definition slightly by multiplying each of the functions $H_\ell$ in (2.62)
with a complex number of absolute value 1. This modification will not change the
frame properties for the generated wavelet system.

Example 2.50. We continue Example 2.49, but now we define

\[ H_\ell(\gamma) = i^{\ell} \sqrt{\binom{2m}{\ell}}\, \sin^{\ell}(\pi\gamma) \cos^{2m-\ell}(\pi\gamma), \quad \ell = 1,\ldots,2m. \tag{2.65} \]

$H_\ell$ only differs from the choice in (2.62) by a constant of absolute value 1, so the
functions $\psi_1,\ldots,\psi_{2m}$ given by

\[ \widehat{\psi_\ell}(2\gamma) = H_\ell(\gamma)\,\widehat{\psi_0}(\gamma), \quad \ell = 1,\ldots,2m, \tag{2.66} \]

also generate a tight multiwavelet frame. Instead of inserting the expression for $\widehat{\psi_0}$
in (2.66), we now rewrite $H_\ell(\gamma)$ using Euler's formula:

\[
H_\ell(\gamma) = i^{\ell} \sqrt{\binom{2m}{\ell}} \left(\frac{e^{\pi i\gamma} - e^{-\pi i\gamma}}{2i}\right)^{\ell} \left(\frac{e^{\pi i\gamma} + e^{-\pi i\gamma}}{2}\right)^{2m-\ell}
= 2^{-2m} \sqrt{\binom{2m}{\ell}}\, \big(e^{\pi i\gamma} - e^{-\pi i\gamma}\big)^{\ell} \big(e^{\pi i\gamma} + e^{-\pi i\gamma}\big)^{2m-\ell}. \tag{2.67}
\]

Via the binomial formula we see that $H_\ell(\gamma)$ is a finite linear combination of terms

\[ e^{-2\pi i m\gamma},\ e^{-2\pi i (m-1)\gamma},\ \ldots,\ e^{2\pi i (m-1)\gamma},\ e^{2\pi i m\gamma}. \]

All coefficients in the linear combination are real. Writing

\[ H_\ell(\gamma) = \sum_{k=-m}^{m} c_{k,\ell}\, e^{2\pi i k\gamma}, \]

it follows that

\[ \psi_\ell = \sqrt{2} \sum_{k=-m}^{m} c_{k,\ell}\, DT_{-k}\psi_0. \tag{2.68} \]

That is, $\psi_\ell$ is a real-valued spline. Since $DT_m\psi_0$ has support in $[0,m]$ and $DT_{-m}\psi_0$
has support in $[-m,0]$, the spline $\psi_\ell$ has support in $[-m,m]$. Our arguments also
show that the splines $\psi_\ell$ inherit other properties from $\psi_0$: They have degree $2m-1$,
belong to $C^{2m-2}(\mathbb{R})$, and have knots at $\mathbb{Z}/2$.

Let us find an explicit expression for the generators in Example 2.50 in the case
$m = 1$:

Example 2.51. In the case $m = 1$, the construction in Example 2.50 leads to two
generators, $\psi_1$ and $\psi_2$. Via the expression (2.67) for $H_1$,

\[
H_1(\gamma) = 2^{-2}\sqrt{2}\, \big(e^{\pi i\gamma} - e^{-\pi i\gamma}\big)\big(e^{\pi i\gamma} + e^{-\pi i\gamma}\big)
= \frac{1}{2\sqrt{2}} \big(e^{2\pi i\gamma} - e^{-2\pi i\gamma}\big).
\]

Via elementary manipulations with the Fourier transform, we conclude that

\[ \psi_1(x) = \frac{1}{\sqrt{2}} \big(B_2(2x+1) - B_2(2x-1)\big). \tag{2.69} \]

See Fig. 2.4. Similarly, one proves (Exercise 25) that

\[ \psi_2(x) = \frac{1}{2} \big(B_2(2x+1) - 2B_2(2x) + B_2(2x-1)\big), \tag{2.70} \]

which is also shown in Fig. 2.4.

Fig. 2.4: (a) The function $\psi_1$ given by (2.69); (b) the function $\psi_2$ given by (2.70).
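
The coefficients $c_{k,\ell}$ in (2.68) can be generated mechanically from (2.67). Writing $w = e^{2\pi i\gamma}$, the product in (2.67) equals $w^{-m}(w-1)^{\ell}(w+1)^{2m-\ell}$, so the coefficients are those of a polynomial in $w$. The sketch below is an added illustration of this observation; the function names are hypothetical, and the centered B-spline $B_2$ is assumed to be the hat function $\max(1-|x|, 0)$.

```python
import numpy as np
from math import comb
from numpy.polynomial import polynomial as P

def mask_coefficients(m, l):
    """Coefficients c_{k,l}, k = -m, ..., m (lowest k first), of H_l in (2.67)."""
    w_minus = P.polypow([-1.0, 1.0], l)            # (w - 1)^l
    w_plus = P.polypow([1.0, 1.0], 2 * m - l)      # (w + 1)^(2m - l)
    return np.sqrt(comb(2 * m, l)) * P.polymul(w_minus, w_plus) / 4.0 ** m

def symbol(gamma, m, l):
    """Closed form (2.65): i^l sqrt(binom(2m, l)) sin^l cos^(2m - l)."""
    s, c = np.sin(np.pi * gamma), np.cos(np.pi * gamma)
    return (1j ** l) * np.sqrt(comb(2 * m, l)) * s ** l * c ** (2 * m - l)

def symbol_from_coefficients(gamma, m, l):
    """Trigonometric polynomial sum_k c_{k,l} exp(2 pi i k gamma)."""
    k = np.arange(-m, m + 1)
    c = mask_coefficients(m, l)
    return np.sum(c[:, None] * np.exp(2j * np.pi * k[:, None] * gamma[None, :]),
                  axis=0)

def b2(x):
    """Centered hat function, assumed here to coincide with B_2."""
    return np.maximum(1.0 - np.abs(x), 0.0)

def psi(x, l):
    """psi_l for m = 1 via (2.58): 2 * sum_k c_{k,l} B_2(2x + k)."""
    c = mask_coefficients(1, l)
    return 2.0 * sum(c[k + 1] * b2(2.0 * x + k) for k in (-1, 0, 1))

print(mask_coefficients(1, 1))   # coefficients of H_1 for k = -1, 0, 1
print(mask_coefficients(1, 2))   # coefficients of H_2 for k = -1, 0, 1
```

For $m = 1$ the coefficient vectors are $(-1/(2\sqrt{2}),\, 0,\, 1/(2\sqrt{2}))$ and $(1/4,\, -1/2,\, 1/4)$, which reproduce (2.69) and (2.70) once inserted into (2.58). For larger $m$ the same routine lists all $2m+1$ real coefficients of each $H_\ell$.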


We note that the computational effort in Example 2.50 increases with the order
of the B-spline $B_{2m}$ we start with: The number of generators $\psi_1,\ldots,\psi_{2m}$ increases
with the order of the spline $B_{2m}$, and (2.68) shows that computation of $\psi_\ell$ involves
the calculation of a large number of coefficients for high-order B-splines.

Wavelet orthonormal bases have traditionally been used for approximation-theoretic
purposes. In this context it is known that the number of vanishing
moments plays a key role. Unfortunately, it has been shown that among the B-spline
frame generators constructed using the unitary extension principle, at least one of
the generators can have at most one vanishing moment. Using a modification of the
unitary extension principle, it has been shown by two groups of researchers [52]
and [61] how one can obtain similar constructions with a high number of vanishing
moments. Also, one can prove that it is possible to construct multiwavelet frames
with two generators based on any B-spline $B_{2m}$, i.e., with any prescribed regularity;
see [52] and [61].
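Exercises

1. Assume that $\{e_k\}_{k=1}^{\infty}$ is a sequence of normalized vectors in a Hilbert space $\mathcal{H}$
and that

\[ \sum_{k=1}^{\infty} |\langle f, e_k\rangle|^2 = \|f\|^2, \quad \forall f \in \mathcal{H}. \]

Show that $\{e_k\}_{k=1}^{\infty}$ is an orthonormal basis for $\mathcal{H}$.

2. Assume that $\{f_k\}_{k=1}^{\infty}$ is a Bessel sequence with bound $B$. Prove that

a. $\|f_k\|^2 \le B$ for all $k \in \mathbb{N}$.

b. If $\|f_k\|^2 = B$ for some $k \in \mathbb{N}$, then $f_k \perp f_j$ for all $j \in \mathbb{N} \setminus \{k\}$.

3. Let $\{f_k\}_{k=1}^{\infty}$ be a sequence in a Hilbert space $\mathcal{H}$. Prove that

a. If there exists $B > 0$ such that

\[ \Big\| \sum c_k f_k \Big\|^2 \le B \sum |c_k|^2 \]

for all finite sequences $\{c_k\}$, then $\sum_{k=1}^{\infty} c_k f_k$ converges for all $\{c_k\}_{k=1}^{\infty} \in \ell^2(\mathbb{N})$
and $\{f_k\}_{k=1}^{\infty}$ is a Bessel sequence with bound $B$.

b. If (2.14) holds for all finite scalar sequences $\{c_k\}$, then it holds for all
$\{c_k\}_{k=1}^{\infty} \in \ell^2(\mathbb{N})$.

c. If $\{f_k\}_{k=1}^{\infty}$ is a Riesz basis, then

\[ \sum_{k=1}^{\infty} c_k f_k \text{ is convergent} \iff \{c_k\}_{k=1}^{\infty} \in \ell^2(\mathbb{N}). \]

4. Two sequences $\{f_k\}_{k=1}^{\infty}$ and $\{g_k\}_{k=1}^{\infty}$ in a Hilbert space are biorthogonal if

\[ \langle f_k, g_j\rangle = \delta_{k,j}. \]

Show that for a pair of dual Riesz bases $\{f_k\}_{k=1}^{\infty}$ and $\{g_k\}_{k=1}^{\infty}$, the following
hold:

a. $\{f_k\}_{k=1}^{\infty}$ and $\{g_k\}_{k=1}^{\infty}$ are biorthogonal.

b. For all $f \in \mathcal{H}$,

\[ f = \sum_{k=1}^{\infty} \langle f, g_k\rangle f_k = \sum_{k=1}^{\infty} \langle f, f_k\rangle g_k. \tag{2.71} \]

5. Prove Proposition 2.11.

6. Show that if (2.14) holds for all finite sequences $\{c_k\}$, it automatically holds
for all $\{c_k\}_{k=1}^{\infty} \in \ell^2(\mathbb{N})$.

7. Prove Lemma 2.14.

8. Find an example of a sequence in a Hilbert space that is a basis but not a frame.

9. Show that if $\{f_k\}_{k=1}^{\infty}$ is a dual frame of a frame $\{g_k\}_{k=1}^{\infty}$, then $\{g_k\}_{k=1}^{\infty}$ is also a
dual frame of $\{f_k\}_{k=1}^{\infty}$.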
10. Prove that the upper and lower frame conditions are unrelated: In an arbitrary
Hilbert space $\mathcal{H}$, there exists a sequence $\{f_k\}_{k=1}^{\infty}$ satisfying the upper condition
for all $f \in \mathcal{H}$ but not the lower condition; and vice versa.

11. Let $\{e_k\}_{k=1}^{\infty}$ be an orthonormal basis and consider the family



Prove that $\{f_k\}_{k=1}^{\infty}$ is not a Bessel sequence.

12. Let $\{f_k\}_{k=1}^{\infty}$ and $\{g_k\}_{k=1}^{\infty}$ be dual frames for a Hilbert space $\mathcal{H}$, and $U : \mathcal{H} \to \mathcal{H}$
a unitary operator. Show that $\{Uf_k\}_{k=1}^{\infty}$ and $\{Ug_k\}_{k=1}^{\infty}$ also form a pair of
dual frames for $\mathcal{H}$.

13. Prove Theorem 2.22.

14. We consider the B-splines $N_2$ and $N_3$.

a. Show via the definition that the B-spline $N_2$ is given by

\[
N_2(x) = \begin{cases}
x, & \text{if } x \in [0,1], \\
2 - x, & \text{if } x \in [1,2], \\
0, & \text{otherwise.}
\end{cases}
\]

b. Use (a) to show that

\[
N_3(x) = \begin{cases}
\tfrac{1}{2}x^2, & \text{if } x \in [0,1], \\
-x^2 + 3x - \tfrac{3}{2}, & \text{if } x \in [1,2], \\
\tfrac{1}{2}x^2 - 3x + \tfrac{9}{2}, & \text{if } x \in [2,3], \\
0, & \text{otherwise.}
\end{cases}
\]

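15. Consider the B-spline $N_n$, $n \in \mathbb{N}$.

a. Show that



b. Show that

\[ \widehat{N_n}(2\gamma) = e^{-\pi i n\gamma} (\cos \pi\gamma)^n\, \widehat{N_n}(\gamma). \]

c. Show that the function

\[ H_0(\gamma) := e^{-\pi i n\gamma} (\cos \pi\gamma)^n \]

is 1-periodic.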


16. Show that the definitions of the centered B-splines B m in (2.30) and (2.31) 
coincide. 
17. Consider the centered B-spline $B_n$, $n \in \mathbb{N}$.

a. Show that

\[ \widehat{B_n}(2\gamma) = (\cos \pi\gamma)^n\, \widehat{B_n}(\gamma). \]

b. Show that the function

\[ H_0(\gamma) := (\cos \pi\gamma)^n \]

is 1-periodic if and only if $n$ is even.

18. Prove Corollary 2.31.

19. Show that for the B-spline , the system cannot be a frame 

for any b > 0. 

20. Show by an example (maybe with $a = b = 1$) that the necessary condition in
Proposition 2.28 does not suffice for $\{E_{mb}T_{na}g\}_{m,n \in \mathbb{Z}}$ being a Gabor frame.

21. Prove that $\{E_m T_{na} \chi_{[0,1]}\}_{m,n \in \mathbb{Z}}$ is a frame for $L^2(\mathbb{R})$ for all $a \in (0,1]$.

22. Prove Corollary 2.33. 

23. Prove Corollary 2.37. 

24. Verify that (2.60) is equivalent to the condition $H(\gamma)^* H(\gamma) = I$, a.e. $\gamma$.

25. Derive expression (2.70) for the function $\psi_2$.

26. Let $\{D^j T_k \psi\}_{j,k \in \mathbb{Z}}$ be a frame with frame operator $S$. Prove that $S$ commutes
with the dilation operator $D$, and thereby that

$$\{S^{-1} D^j T_k \psi\}_{j,k \in \mathbb{Z}} = \{D^j S^{-1} T_k \psi\}_{j,k \in \mathbb{Z}}.$$


Chapter 3 

Continuous and Discrete Reproducing Systems 
That Arise from Translations. Theory and 
Applications of Composite Wavelets 


Demetrio Labate and Guido Weiss 


Abstract Reproducing systems of functions such as the wavelet and Gabor systems 
have been particularly successful in a variety of applications from both mathematics 
and engineering. In this chapter, we review a number of recent results in the 
study of such systems and their generalizations developed by the authors and their 
collaborators. We first describe the unified theory of reproducing systems. This is a 
simple and flexible mathematical framework to characterize and analyze wavelets, 
Gabor systems, and other reproducing systems in a unified manner. The systems 
of interest to us are obtained by applying families of translations, modulations, and 
dilations to a countable set of functions. As the reader will see, we can rewrite such 
systems as a countable family of translations applied to a countable collection of 
functions. Building in part on this approach, we define the wavelets with composite 
dilations, a novel class of reproducing systems that provide truly multidimensional 
generalizations of traditional wavelets. For example, in dimension 2, the elements 
of such systems are defined not only at various scales and locations, as traditional 
wavelet systems, but also at various orientations. The shearlet system is a special 
case of a composite wavelet system that provides an optimally sparse representation 
for a large class of bivariate functions. This is useful for a number of applications 
in image processing, such as image denoising and edge detection. Finally, we
discuss some related issues about the continuous wavelet transform and the continuous
analogues of composite wavelets.

3.1 Introduction 

These lectures present an overview of a research program developed by the authors 
and their collaborators at Washington University in St. Louis during the past 
10 years, which is devoted to the study of reproducing systems of functions. 
By reproducing systems of functions, we refer to those families of functions
$\{\psi_i : i \in I\}$ in $L^2(\mathbb{R}^n)$ that are obtained by applying a countable collection of
operators to a countable set of “generating” functions and have the property that
any function $f \in L^2(\mathbb{R}^n)$ can be recovered from the reproducing formula

$$f = \sum_{i \in I} \langle f, \psi_i \rangle\, \psi_i,$$

with convergence in the $L^2$-norm. The wavelet systems, for example, have received
a great deal of attention in the last 20 years, since their applications in mathematics
and engineering have been especially successful. In dimension $n = 1$, they are
defined as those collections of the form

$$\Psi = \{\psi_{j,k} = 2^{j/2}\, \psi(2^j \cdot - k) : j,k \in \mathbb{Z}\}, \tag{3.1}$$

where $\psi$ is a fixed function in $L^2(\mathbb{R})$. As the above expression shows, $\Psi$ is obtained
by applying dyadic dilations and integer translations to the generating function $\psi$.
For particular choices of the generator $\psi$, the wavelet system $\Psi$ is an orthonormal
basis or a Parseval frame for $L^2(\mathbb{R})$, in which case any $f \in L^2(\mathbb{R})$ can be
recovered as

$$f = \sum_{j,k \in \mathbb{Z}} \langle f, \psi_{j,k} \rangle\, \psi_{j,k}, \tag{3.2}$$

with convergence in the $L^2$-norm. Other important classes of reproducing systems
are the Gabor systems, which are obtained by applying translations and modulations
to a fixed generator, and the wave packet systems, which involve translations,
dilations, and modulations.

One main theme developed in these lectures is that there is a general framework 
that allows us to describe and analyze wavelet systems, Gabor systems, and many 
other reproducing systems by using a unified approach. Indeed, for a large class of 
reproducing systems of the form 

$$\{T_{C_p k}\, g_p : k \in \mathbb{Z}^n,\ p \in \mathcal{P}\}, \tag{3.3}$$

where $\mathcal{P}$ is countable and $\{C_p\}$ is a set of invertible matrices, there is a relatively
simple set of equations that characterizes those generating functions $\{g_p\}_{p \in \mathcal{P}}$ such
that the corresponding system (3.3) is an orthonormal basis or, more generally, a
Parseval frame for $L^2(\mathbb{R}^n)$. For example, it was discovered by Gripenberg [107] and
Wang [232] independently, in 1995, that a function $\psi \in L^2(\mathbb{R})$ is the generator of
an orthonormal wavelet system if and only if $\|\psi\|_2 = 1$,

$$\sum_{j \in \mathbb{Z}} |\hat{\psi}(2^j \xi)|^2 = 1 \quad \text{for a.e. } \xi \in \mathbb{R}, \tag{3.4}$$

and

$$t_q(\xi) = \sum_{j \ge 0} \hat{\psi}(2^j \xi)\, \overline{\hat{\psi}(2^j(\xi + q))} = 0 \quad \text{for a.e. } \xi \in \mathbb{R}, \tag{3.5}$$

whenever $q$ is an odd integer. It is remarkable that a similar set of characterization
equations holds not only for wavelet systems in higher dimensions, but also for
many other reproducing systems. This topic, and the corresponding unified theory
of reproducing systems, will be presented in Section 3.2.
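As a small numerical illustration of the two characterization equations, the sketch below checks that the sum over $j \in \mathbb{Z}$ of $|\hat{\psi}(2^j\xi)|^2$ equals 1 and that the sum over $j \ge 0$ of $\hat{\psi}(2^j\xi)\overline{\hat{\psi}(2^j(\xi+q))}$ vanishes for odd $q$, at randomly sampled frequencies. The choice of the Shannon wavelet (whose Fourier transform is the indicator of $\{1/2 \le |\xi| < 1\}$), the sampling range, and the function names are assumptions made for this example and do not come from the text.

```python
import numpy as np

def psi_hat(xi):
    """Fourier transform of the Shannon wavelet (assumed example):
    the indicator function of the set 1/2 <= |xi| < 1."""
    a = np.abs(xi)
    return ((a >= 0.5) & (a < 1.0)).astype(float)

def calderon_sum(xi, jmax=60):
    """Sum over j in [-jmax, jmax] of |psi_hat(2^j xi)|^2, a truncation of (3.4)."""
    return sum(psi_hat(2.0**j * xi) ** 2 for j in range(-jmax, jmax + 1))

def t_q(xi, q, jmax=60):
    """Sum over 0 <= j <= jmax of psi_hat(2^j xi) * conj(psi_hat(2^j (xi + q))),
    a truncation of (3.5)."""
    return sum(psi_hat(2.0**j * xi) * np.conj(psi_hat(2.0**j * (xi + q)))
               for j in range(0, jmax + 1))

rng = np.random.default_rng(0)
xi = rng.uniform(-8.0, 8.0, size=2000)
xi = xi[np.abs(xi) > 1e-6]          # the equations are only required a.e.

sum_34 = calderon_sum(xi)                          # expected to equal 1
sums_35 = {q: t_q(xi, q) for q in (-3, -1, 1, 3)}  # expected to vanish
```

For this generator exactly one dyadic scale meets each nonzero frequency, which is why the first sum is identically 1; the truncation at `jmax` is harmless for the sampled range.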

Parallel to the unified theory mentioned above, there is another “unifying”
perspective to the study of reproducing systems that is provided by representation
theory and, more specifically, by the study of the continuous wavelet transform
and its generalization. In Section 3.3, we introduce the continuous analogues of the
wavelet systems (3.1), which are obtained by applying dilations (with respect to a
dilation group) and continuous translations to a function $\psi \in L^2(\mathbb{R}^n)$. For example,
in dimension $n = 1$, the continuous wavelet system is a system of the form

$$\{\psi_{at} = a^{-1/2}\, \psi(a^{-1}(\cdot - t)) : a > 0,\ t \in \mathbb{R}\},$$

and the (one-dimensional) continuous wavelet transform is the mapping

$$f \mapsto \Big\{ \langle f, \psi_{at} \rangle = a^{-1/2} \int_{\mathbb{R}} f(y)\, \overline{\psi(a^{-1}(y - t))}\, dy : (a,t) \in \mathbb{R}^+ \times \mathbb{R} \Big\}.$$

Then, provided that $\psi$ satisfies a certain admissibility condition, any $f \in L^2(\mathbb{R})$ can
be expressed using the Calderón reproducing formula:

$$f = \int_{\mathbb{R}} \int_0^{\infty} \langle f, \psi_{at} \rangle\, \psi_{at}\, \frac{da}{a^2}\, dt. \tag{3.6}$$

The close relationship between the discrete and continuous frameworks is apparent 
by comparing the last expression with formula (3.2). A number of observations 
concerning this relationship, as well as several multidimensional extensions of the 
continuous wavelet transform, are discussed in Section 3.3. 

Traditional multidimensional wavelet systems are obtained by taking tensor
products of one-dimensional ones; as a result, they have a very limited capability
to deal effectively with those directional features that typically occur in images and
other multidimensional data. To overcome such limitations, several extensions and
generalizations have been proposed in applied harmonic analysis during the last 10
years. One such approach is the theory of wavelets with composite dilations, which
was originally introduced by the authors and their collaborators and provides a very
flexible and powerful framework to construct “truly” multidimensional extensions
of the wavelet approach.

An example of a composite wavelet system, in dimension n = 2, is the collection 


$$\{\psi_{ijk} = |\det A|^{i/2}\, \psi(B^j A^i \cdot - k) : i,j \in \mathbb{Z},\ k \in \mathbb{Z}^2\}, \tag{3.7}$$

where A = ^ and B = ^ . The elements of such systems are defined not 

only at various scales and locations, as traditional wavelet systems, but also at various
orientations, associated with the powers of the shearing matrix $B$. In addition,
for appropriate choices of $\psi$, the elements $\psi_{ijk}$ have the ability to provide very
efficient representations for data containing directional and anisotropic features 
(see Section 3.5). There is a variety of systems of the form (3.7) forming Parseval 
frames or even orthonormal bases, for many choices of matrices A and B. Indeed, the 
theory of wavelets with composite dilations encompasses the theory of wavelets, and 
there is a generalized multiresolution analysis (MRA) associated with this theory. 
As in the case of the classical MRA, this framework allows one to obtain a variety 
of constructions with many different geometric and analytic properties. An outline 
of this theory is presented in Section 3.4. 

In Section 3.5, we examine a generalization of the wavelet transform associated 
with the affine group 


$$G = \{(M,t) : M \in \mathcal{D}_\alpha,\ t \in \mathbb{R}^2\},$$
where, for each $0 < \alpha < 1$, $\mathcal{D}_\alpha \subset GL_2(\mathbb{R})$ is the set of matrices



Associated with this is the continuous shearlet transform $\mathcal{S}_\psi^\alpha$, defined by

$$f \mapsto \{\mathcal{S}_\psi^\alpha f(a,s,t) = \langle f, \psi_{ast} \rangle : a > 0,\ s \in \mathbb{R},\ t \in \mathbb{R}^2\},$$

which maps $f \in L^2(\mathbb{R}^2)$ into a transform domain dependent on the scale $a$,
the shearing parameter $s$, and the location $t$. The analyzing elements $\psi_{ast}$, forming a
continuous shearlet system, are the functions

$$\psi_{ast}(x) = |\det M_{as}|^{-1/2}\, \psi(M_{as}^{-1}(x - t)), \tag{3.8}$$

with $M_{as} \in \mathcal{D}_\alpha$. One remarkable property is that the continuous shearlet transform
of a function $f$ has the ability to completely characterize both the location and the
geometry of the set of singularities of $f$.

A discrete shearlet system is obtained by appropriately discretizing the functions
(3.8). Indeed, such a discrete system can be designed so that it forms a Parseval
frame and it provides us with a special case of wavelets with composite dilations
(3.7). In addition, the generator $\psi$ can be chosen to be a well-localized function;
that is, $\psi$ has fast decay in both the space and frequency domains (see [124, 127]).
As a result, the elements of the discrete shearlet system form a collection of
well-localized waveforms at various scales, locations, and orientations and provide
optimally sparse representations for a large class of bivariate functions with
distributed discontinuities. Only the curvelets introduced by Candès and Donoho
have been proved to have similar properties; however, the curvelets do not share
the simple affine-like structure of wavelets with composite dilations. To illustrate
the advantages of the shearlet framework with respect to wavelets and other
traditional representations, we describe a number of useful applications of shearlets
to the analysis and processing of images, including some representative applications
of feature extraction and edge detection.


3.2 Unified Theory of Reproducing Systems 


In order to describe the types of reproducing systems that we will consider in this
study, it will be useful to introduce the following definitions. We adopt the convention
that $x \in \mathbb{R}^n$ is a column vector, i.e., $x = (x_1, \ldots, x_n)^T$, and that $\xi \in \widehat{\mathbb{R}}^n$ is a row vector,
i.e., $\xi = (\xi_1, \ldots, \xi_n)$. A vector $x$ multiplying a matrix $M \in GL_n(\mathbb{R})$ on the right is
understood to be a column vector, while a vector $\xi$ multiplying $M$ on the left is a
row vector. Thus, $Mx \in \mathbb{R}^n$ and $\xi M \in \widehat{\mathbb{R}}^n$.

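Let $f \in L^2(\mathbb{R}^n)$. For $y \in \mathbb{R}^n$, the translation operator $T_y$ is defined by
$T_y f(x) = f(x - y)$; for $M \in GL_n(\mathbb{R})$, the dilation operator $D_M$ is defined by
$D_M f(x) = |\det M|^{-1/2} f(M^{-1}x)$; for $\nu \in \mathbb{R}^n$, the modulation operator $M_\nu$ is defined
by $(M_\nu f)(x) = e^{2\pi i \nu x} f(x)$.

We will use the Fourier transform in the form

$$\hat{f}(\xi) = \int_{\mathbb{R}^n} f(x)\, e^{-2\pi i \xi x}\, dx,$$

for $f \in L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$. Thus, the inverse Fourier transform is given by

$$\check{f}(x) = \int_{\widehat{\mathbb{R}}^n} f(\xi)\, e^{2\pi i \xi x}\, d\xi.$$

We remark that $(T_y f)^\wedge(\xi) = (M_{-y} \hat{f})(\xi)$ and $(D_M f)^\wedge(\xi) = (\hat{D}_M \hat{f})(\xi)$, where

$$(\hat{D}_M \hat{f})(\xi) = |\det M|^{1/2}\, \hat{f}(\xi M).$$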

Virtually all systems of functions used in harmonic analysis to generate subspaces
of $L^2(\mathbb{R}^n)$ are obtained by applying a certain combination of translations,
dilations, and modulations to a finite family of functions in $L^2(\mathbb{R}^n)$. Let us start by
recalling the definitions of the systems commonly used in many harmonic analysis
applications.

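Gabor Systems. Let $\Psi = \{\psi^1, \ldots, \psi^L\} \subset L^2(\mathbb{R}^n)$, and $B, C \in GL_n(\mathbb{R})$. The Gabor
systems are the collections

$$\mathcal{G} = \mathcal{G}_{B,C}(\Psi) = \{M_{Bm} T_{Ck}\, \psi^\ell : m,k \in \mathbb{Z}^n,\ \ell = 1, \ldots, L\}$$

or

$$\widetilde{\mathcal{G}} = \widetilde{\mathcal{G}}_{B,C}(\Psi) = \{T_{Ck} M_{Bm}\, \psi^\ell : m,k \in \mathbb{Z}^n,\ \ell = 1, \ldots, L\}.$$

Notice that $\widetilde{\mathcal{G}}$ is obtained from $\mathcal{G}$ by interchanging the order of the translation and
modulation operators. Also, it follows directly from the definitions of $T_y$ and $M_\nu$ that

$$M_{Bm} T_{Ck}\, \psi^\ell = e^{2\pi i\, Bm \cdot Ck}\, T_{Ck} M_{Bm}\, \psi^\ell.$$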
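Affine Systems. Given $\Psi = \{\psi^1, \ldots, \psi^L\} \subset L^2(\mathbb{R}^n)$, $\mathcal{A} \subset GL_n(\mathbb{R})$, and $\Gamma \subset \mathbb{R}^n$,
the affine systems are the collections

$$\mathcal{F} = \mathcal{F}_{\mathcal{A},\Gamma}(\Psi) = \{D_a T_\gamma\, \psi^\ell : a \in \mathcal{A},\ \gamma \in \Gamma,\ \ell = 1, \ldots, L\}.$$

Very often we use the notation $\mathcal{A} = \{M^j : j \in \mathbb{Z}\}$, where $M \in GL_n(\mathbb{R})$ is expanding
(i.e., each proper value $\lambda$ of $M$ satisfies $|\lambda| > 1$), and $\Gamma$ is the lattice $C\mathbb{Z}^n$, where
$C \in GL_n(\mathbb{R})$.

Wave Packet Systems. These include the above two systems. For $\Psi = \{\psi^1, \ldots, \psi^L\}$,
they consist of those functions

$$\mathcal{W}_{\mathcal{A},\Gamma,S}(\Psi) = \{T_\gamma D_a M_\nu\, \psi^\ell : a \in \mathcal{A},\ \gamma \in \Gamma,\ \nu \in S,\ \ell = 1, \ldots, L\},$$

where $\Gamma, S$ are countable (or finite) subsets of $\mathbb{R}^n$, $\mathcal{A} \subset GL_n(\mathbb{R})$. As will be discussed
later, the order of the three operators $T_\gamma, D_a, M_\nu$ can be permuted.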

It is easy to see that each of the above systems can be expressed in the following 
form. 

Let $\mathcal{P}$ be a countable indexing set, $\{g_p : p \in \mathcal{P}\}$ a family of functions in $L^2(\mathbb{R}^n)$,
and $\{C_p : p \in \mathcal{P}\}$ a corresponding collection of matrices in $GL_n(\mathbb{R})$. Then each of
the systems we just described has the form

$$\{T_{C_p k}\, g_p : k \in \mathbb{Z}^n,\ p \in \mathcal{P}\}. \tag{3.9}$$

Indeed, in order to write the general wave packet system in the form (3.9),
one needs just to use the “commutativity relations” $D_M T_y = T_{My} D_M$ and $M_\nu T_k =
e^{2\pi i \nu k}\, T_k M_\nu$ (notice that $e^{2\pi i \nu k}$ is a constant of absolute value 1).
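The two commutativity relations, $D_M T_y = T_{My} D_M$ and $M_\nu T_k = e^{2\pi i \nu k}\, T_k M_\nu$, can be checked pointwise from the operator definitions $T_y f(x) = f(x-y)$, $D_M f(x) = |\det M|^{-1/2} f(M^{-1}x)$, and $(M_\nu f)(x) = e^{2\pi i \nu x} f(x)$. The following is a minimal sketch in dimension $n = 1$; the test function, the grid, and the parameter values are arbitrary choices made for this example.

```python
import numpy as np

def T(y):
    """Translation operator: (T_y f)(x) = f(x - y)."""
    return lambda f: (lambda x: f(x - y))

def D(M):
    """Dilation operator for n = 1: (D_M f)(x) = |M|^{-1/2} f(x / M)."""
    return lambda f: (lambda x: abs(M) ** -0.5 * f(x / M))

def Mod(nu):
    """Modulation operator: (M_nu f)(x) = exp(2 pi i nu x) f(x)."""
    return lambda f: (lambda x: np.exp(2j * np.pi * nu * x) * f(x))

f = lambda x: np.exp(-x**2) * (1.0 + x)   # arbitrary smooth test function
x = np.linspace(-5.0, 5.0, 1001)
M, y, nu, k = 3.0, 0.7, 1.3, 2.0          # arbitrary parameters

# D_M T_y = T_{My} D_M
lhs_dil = D(M)(T(y)(f))(x)
rhs_dil = T(M * y)(D(M)(f))(x)

# M_nu T_k = exp(2 pi i nu k) T_k M_nu
lhs_mod = Mod(nu)(T(k)(f))(x)
rhs_mod = np.exp(2j * np.pi * nu * k) * T(k)(Mod(nu)(f))(x)
```

Moving every translation to the left in this way is what turns a wave packet element $T_\gamma D_a M_\nu \psi$ into a translate of a new generator, up to a unimodular constant that does not affect frame properties.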


3.2.1 Unified Theorem for Reproducing Systems 

In the theory of wavelets and, more generally, in harmonic analysis, it is of paramount
importance to construct such systems that form a reproducing set for the
space $L^2(\mathbb{R}^n)$ (or more general function spaces). For example, it is of particular
interest to know when a system $\{\varphi_j : j \in \mathbb{Z}\}$ of functions in $L^2(\mathbb{R}^n)$ is an orthonormal
basis or, more generally, a frame. Many characterizations of systems that are
Parseval frames have been given in the literature; most often these results concern
affine systems [107, 106, 137, 158, 206, 232].

We shall now give necessary and sufficient conditions for the system (3.3) to be
a Parseval frame for $L^2(\mathbb{R}^n)$. For simplicity, we are letting the lattice $\Gamma$ be $\mathbb{Z}^n$; our
arguments below can be easily extended to a more general $\Gamma$.

Recall that a countable collection $\{\phi_i\}_{i \in I}$ in a (separable) Hilbert space $\mathcal{H}$ is a
Parseval frame (sometimes called a tight frame with constant 1) for $\mathcal{H}$ if

$$\sum_{i \in I} |\langle f, \phi_i \rangle|^2 = \|f\|^2 \quad \text{for all } f \in \mathcal{H}.$$




This is equivalent to the reproducing formula $f = \sum_{i \in I} \langle f, \phi_i \rangle\, \phi_i$, for all $f \in \mathcal{H}$, where
the series converges unconditionally in the norm of $\mathcal{H}$. This shows that a Parseval
frame provides a basis-like representation even though a Parseval frame need not be
a basis in general. We refer the reader to [43, 47] for more details about frames.
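A finite-dimensional example may help to see why a Parseval frame behaves like a basis without being one. The sketch below uses three equally spaced unit vectors in $\mathbb{R}^2$, scaled by $\sqrt{2/3}$ (often called the Mercedes-Benz frame). This example is not taken from the text; it is only meant to illustrate the definition and the reproducing formula.

```python
import numpy as np

# Three equally spaced directions in the plane, scaled so the frame is Parseval.
angles = np.pi / 2 + 2 * np.pi * np.arange(3) / 3
Phi = np.sqrt(2.0 / 3.0) * np.column_stack([np.cos(angles), np.sin(angles)])
# Each row of Phi is one frame vector phi_i; there are 3 vectors in a 2-dimensional space.

def analysis(f):
    """Frame coefficients <f, phi_i>, i = 1, 2, 3."""
    return Phi @ f

def reconstruct(f):
    """Reproducing formula: sum over i of <f, phi_i> phi_i."""
    return Phi.T @ analysis(f)

rng = np.random.default_rng(1)
f = rng.normal(size=2)
energy_of_coefficients = np.sum(analysis(f) ** 2)   # should equal |f|^2
f_rebuilt = reconstruct(f)                          # should equal f
```

The three vectors are linearly dependent, so they do not form a basis, yet the coefficients still carry the full energy of `f` and the same vectors reconstruct it.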

We refer to the following result as the “unifying theorem for reproducing 
systems” [137]: 

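Theorem 3.1. Let $\mathcal{P}$ be a countable indexing set, $\{g_p\}_{p \in \mathcal{P}}$ a collection of functions
in $L^2(\mathbb{R}^n)$, and $\{C_p\}_{p \in \mathcal{P}} \subset GL_n(\mathbb{R})$. Let

$$\mathcal{D} = \{f \in L^2(\mathbb{R}^n) : \hat{f} \in L^\infty(\mathbb{R}^n) \text{ and } \operatorname{supp} \hat{f} \text{ is compact}\},$$

and suppose that

$$L(f) = \sum_{p \in \mathcal{P}} \sum_{m \in \mathbb{Z}^n} \int_{\operatorname{supp} \hat{f}} |\hat{f}(\xi + m C_p^{-1})|^2\, \frac{1}{|\det C_p|}\, |\hat{g}_p(\xi)|^2\, d\xi < \infty \tag{3.10}$$

for all $f \in \mathcal{D}$. Then the system (3.3) is a Parseval frame for $L^2(\mathbb{R}^n)$ if and only if

$$\sum_{p \in \mathcal{P}_\alpha} \frac{1}{|\det C_p|}\, \overline{\hat{g}_p(\xi)}\, \hat{g}_p(\xi + \alpha) = \delta_{\alpha,0} \quad \text{for a.e. } \xi \in \widehat{\mathbb{R}}^n, \tag{3.11}$$

for each $\alpha \in \Lambda = \bigcup_{p \in \mathcal{P}} \mathbb{Z}^n C_p^{-1}$, where $\mathcal{P}_\alpha = \{p \in \mathcal{P} : \alpha C_p \in \mathbb{Z}^n\}$ and $\delta$ is the
Kronecker delta for $\widehat{\mathbb{R}}^n$.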

Before discussing the proof of this theorem, it will be useful to make a few comments
about this result, in order to elucidate its context and impact.

Remark 3.2. It is relatively well known that if $\psi \in L^2(\mathbb{R})$, then $\{\psi_{jk} = D_2^j T_k \psi :
j,k \in \mathbb{Z}\}$ is an orthonormal basis for $L^2(\mathbb{R})$ (i.e., $\psi$ is an orthonormal wavelet)
if and only if Eqs. (3.4) and (3.5) hold. As we mentioned above, this result was
obtained independently by Gripenberg [107] and Wang [232]. As will be discussed,
these equations are a simple consequence of Theorem 3.1 (see Exercise 1).

Remark 3.3. Assumption (3.10) is referred to as the local integrability condition 
(LIC). At first sight, it might appear as a rather formidable technical hypothesis. 
In some cases, however, it can be shown that it is a simple consequence of the 
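system being considered. For example, let us consider the Gabor system $\widetilde{\mathcal{G}}_{B,C}(G)$,
where $G = \{g^1, \ldots, g^L\}$, and let us write it in the form (3.3). Namely, let
$\mathcal{P} = \mathbb{Z}^n \times \{1, 2, \ldots, L\}$, $g_p = g_{j,\ell} = M_{Bj}\, g^\ell$, and $C_p = C$, so that

$$T_{C_p k}\, g_p = T_{Ck} M_{Bj}\, g^\ell.$$

Without loss of generality, we can assume that $L = 1$. Thus, the expression
of (3.10) is

$$L(f) = \sum_{j \in \mathbb{Z}^n} \sum_{m \in \mathbb{Z}^n} \int_{\operatorname{supp} \hat{f}} |\hat{f}(\xi + mC^{-1})|^2\, \frac{1}{|\det C|}\, |\hat{g}(\xi - Bj)|^2\, d\xi,$$

for $f \in \mathcal{D}$, where $K = \operatorname{supp} \hat{f}$ is compact. Since $\xi \in K$, only a finite number of terms in
the sum $\sum_{m \in \mathbb{Z}^n}$ are nonzero. Moreover, if $\mathbb{T}^n$ is the $n$-torus, for each $j \in \mathbb{Z}^n$, the set
$\{B(\mathbb{T}^n + j - p) : p \in \mathbb{Z}^n\}$ is a partition of $\mathbb{R}^n$. Thus,

$$\|g\|_2^2 = \int_{\bigcup_{p \in \mathbb{Z}^n} B(\mathbb{T}^n + j - p)} |\hat{g}(\xi)|^2\, d\xi = \sum_{p \in \mathbb{Z}^n} \int_{B(\mathbb{T}^n + j)} |\hat{g}(\xi - Bp)|^2\, d\xi.$$

Now observe that a finite union of the sets $\{B(\mathbb{T}^n + j) : j \in \mathbb{Z}^n\}$ covers $K$. Using
this fact and the fact that $\|\hat{f}\|_\infty < \infty$ (since $f \in \mathcal{D}$), it is not difficult to show that

$$L(f) \le A\, \|g\|_2^2,$$

where $A$ is a positive constant. As a result, the characterization theorem for the
Gabor systems can be stated explicitly as follows:

Theorem 3.4. The system $\mathcal{G}_{B,C}(G)$ [or the system $\widetilde{\mathcal{G}}_{B,C}(G)$] is a Parseval frame for
$L^2(\mathbb{R}^n)$ if and only if

$$\sum_{\ell=1}^{L} \sum_{k \in \mathbb{Z}^n} \frac{1}{|\det C|}\, \overline{\hat{g}^\ell(\xi - Bk)}\, \hat{g}^\ell(\xi - Bk + mC^{-1}) = \delta_{m,0}$$

for a.e. $\xi \in \widehat{\mathbb{R}}^n$, all $m \in \mathbb{Z}^n$.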

This result is well known and can be found, for example, in [145, 207, 58, 158]. 
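The condition in Theorem 3.4 can be tested numerically in the simplest case. The sketch below assumes $n = 1$, $L = 1$, $B = C = 1$, and the generator $g = \chi_{[0,1]}$, for which the integer time-frequency shifts are known to form an orthonormal basis of $L^2(\mathbb{R})$. These choices, the truncation parameter `K`, and the function names are assumptions made for illustration only.

```python
import numpy as np

def g_hat(xi):
    """Fourier transform of the indicator of [0, 1]:
    exp(-pi i xi) * sin(pi xi) / (pi xi); np.sinc(x) is sin(pi x) / (pi x)."""
    return np.exp(-1j * np.pi * xi) * np.sinc(xi)

def gabor_sum(xi, m, K=20000):
    """Truncated left-hand side of the condition in Theorem 3.4 for B = C = 1:
    sum over |k| <= K of conj(g_hat(xi - k)) * g_hat(xi - k + m)."""
    k = np.arange(-K, K + 1)
    return np.sum(np.conj(g_hat(xi - k)) * g_hat(xi - k + m))

sample_points = (0.1, 0.37, 0.5, 0.93)
diagonal = [gabor_sum(xi, 0) for xi in sample_points]                     # close to 1
off_diagonal = [gabor_sum(xi, m) for xi in sample_points for m in (1, 2, -3)]  # close to 0
```

The terms of the series decay only like $1/k^2$, so the truncated sums agree with the limits $\delta_{m,0}$ up to an error of roughly $1/K$; this is a sanity check of the formula, not a proof.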

The situation for the “usual” affine systems is somewhat more subtle. Here, by
the word “usual,” we mean the case where $\mathcal{A} = \{a^j : j \in \mathbb{Z}\}$, where $a \in GL_n(\mathbb{R})$
is expanding, and $\Gamma = \mathbb{Z}^n$. In this case, one can show that if conditions (3.11) are
true, then the LIC is valid and, conversely, if the system (3.3) is a Parseval frame,
then the LIC also holds. Thus, in the characterization of Parseval frames given by
Theorem 3.1, there is no need to assume the LIC. The characterization theorem for
these systems can be written down explicitly as

Theorem 3.5. Let $\Psi = \{\psi^1, \ldots, \psi^L\} \subset L^2(\mathbb{R}^n)$ and $a \in GL_n(\mathbb{R})$ be expanding. Then
the system $\mathcal{F}_{\mathcal{A}}(\Psi) = \{D_a^j T_k\, \psi^\ell : j \in \mathbb{Z},\ k \in \mathbb{Z}^n,\ \ell = 1, \ldots, L\}$ is a Parseval frame
for $L^2(\mathbb{R}^n)$ if and only if

$$\sum_{\ell=1}^{L} \sum_{j \in \mathcal{P}_\alpha} \overline{\hat{\psi}^\ell(\xi a^{-j})}\, \hat{\psi}^\ell((\xi + \alpha) a^{-j}) = \delta_{\alpha,0} \quad \text{for a.e. } \xi \in \widehat{\mathbb{R}}^n, \tag{3.12}$$

for all $\alpha \in \Lambda = \bigcup_{j \in \mathbb{Z}} \mathbb{Z}^n a^j$, where $\mathcal{P}_\alpha = \{j \in \mathbb{Z} : \alpha a^{-j} \in \mathbb{Z}^n\}$.

Apart from the argument needed to establish the validity of the LIC, which we
mentioned above, this last theorem is a simple consequence of Theorem 3.1 once
the system $\mathcal{F}_{\mathcal{A},\Gamma}(\Psi)$ is expressed in the form (3.3). Notice that there is a redundancy
in condition (3.12). Indeed, an elementary argument shows that (3.12) can be
simplified to

$$\sum_{\ell=1}^{L} \sum_{j \in \mathcal{P}_m} \overline{\hat{\psi}^\ell(\xi a^{-j})}\, \hat{\psi}^\ell((\xi + m) a^{-j}) = \delta_{m,0} \quad \text{for a.e. } \xi \in \widehat{\mathbb{R}}^n, \tag{3.13}$$

for all $m \in \mathbb{Z}^n$, where $\mathcal{P}_m = \{j \in \mathbb{Z} : m a^{-j} \in \mathbb{Z}^n\}$. It follows easily from this form
of Theorem 3.5 that the result of Gripenberg and Wang (given in Remark 3.2) holds
for $n = 1$ and $a = 2$.

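In order to present the ideas involved in the proof of Theorem 3.1, it is useful
to introduce the $C$-bracket product of $f, g \in L^2(\mathbb{R}^n)$, which, for $C \in GL_n(\mathbb{R})$, is
defined by

$$[f,g](x;C) = \sum_{k \in \mathbb{Z}^n} f(x - Ck)\, \overline{g(x - Ck)}.$$

It is clear that $[f,g]$ is $C\mathbb{Z}^n$-periodic; that is, $[f,g](x + Cm; C) = [f,g](x; C)$ for each
$m \in \mathbb{Z}^n$.

That the system (3.3) is a Parseval frame for $L^2(\mathbb{R}^n)$ is equivalent to

$$N^2(f) = \sum_{p \in \mathcal{P}} \sum_{k \in \mathbb{Z}^n} |\langle f, T_{C_p k}\, g_p \rangle|^2 = \|f\|_2^2 \tag{3.14}$$

for all $f \in \mathcal{D}$ [recall that $\mathcal{D}$ is dense in $L^2(\mathbb{R}^n)$].

Using the fact that $\widehat{\mathbb{R}}^n = \bigcup_{l \in \mathbb{Z}^n} C^I(\mathbb{T}^n - l)$ is a disjoint union, where $C^I = (C^T)^{-1}$,
it follows easily that

$$\begin{aligned}
\langle f, T_{Ck}\, g \rangle &= \int_{\widehat{\mathbb{R}}^n} \hat{f}(\xi)\, \overline{\hat{g}(\xi)}\, e^{2\pi i \xi \cdot Ck}\, d\xi \\
&= \sum_{l \in \mathbb{Z}^n} \int_{C^I(\mathbb{T}^n)} \hat{f}(\xi - C^I l)\, \overline{\hat{g}(\xi - C^I l)}\, e^{2\pi i \xi \cdot Ck}\, d\xi \\
&= \int_{C^I(\mathbb{T}^n)} [\hat{f}, \hat{g}](\xi; C^I)\, e^{2\pi i \xi \cdot Ck}\, d\xi.
\end{aligned}$$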

Under all these assumptions, let us consider the function

$$H(x) = \sum_{k \in \mathbb{Z}^n} |\langle T_x f, T_{Ck}\, g \rangle|^2,$$

where $C \in GL_n(\mathbb{R})$. It is clear that the function $H$ is $C\mathbb{Z}^n$-periodic. Using
the fact that $\hat{f}$ has compact support, one can show the following.

Lemma 3.6. The function $H(x)$ is the trigonometric polynomial

$$H(x) = \sum_{m \in \mathbb{Z}^n} \hat{H}(m)\, e^{2\pi i (C^I m) \cdot x},$$

where

$$\hat{H}(m) = \frac{1}{|\det C|} \int_{\widehat{\mathbb{R}}^n} \hat{f}(\xi)\, \overline{\hat{f}(\xi + C^I m)}\, \overline{\hat{g}(\xi)}\, \hat{g}(\xi + C^I m)\, d\xi,$$

and only a finite number of these expressions are nonzero.

The fact that $\hat{H}(m) \ne 0$ for at most finitely many $m$ follows from the fact that $\hat{f}$ has
compact support.

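To show that Eq. (3.14) holds for all $f \in \mathcal{D}$, consider now the function

$$w(x) = N^2(T_x f) = \sum_{p \in \mathcal{P}} H_p(x),$$

where $H_p(x) = \sum_{k \in \mathbb{Z}^n} |\langle T_x f, T_{C_p k}\, g_p \rangle|^2$. By Lemma 3.6, for each $p \in \mathcal{P}$,

$$H_p(x) = \sum_{m \in \mathbb{Z}^n} \hat{H}_p(m)\, e^{2\pi i (C_p^I m) \cdot x},$$

where

$$\hat{H}_p(m) = \frac{1}{|\det C_p|} \int_{\widehat{\mathbb{R}}^n} \hat{f}(\xi)\, \overline{\hat{f}(\xi + C_p^I m)}\, \overline{\hat{g}_p(\xi)}\, \hat{g}_p(\xi + C_p^I m)\, d\xi.$$

Thus, using the assumptions of Theorem 3.1, from the observations we made above,
we have the expression

$$w(x) = N^2(T_x f) = \sum_{\alpha \in \Lambda} \hat{w}(\alpha)\, e^{2\pi i \alpha \cdot x}, \tag{3.15}$$

where

$$\hat{w}(\alpha) = \int_{\widehat{\mathbb{R}}^n} \hat{f}(\xi)\, \overline{\hat{f}(\xi + \alpha)} \sum_{p \in \mathcal{P}_\alpha} \frac{1}{|\det C_p|}\, \overline{\hat{g}_p(\xi)}\, \hat{g}_p(\xi + \alpha)\, d\xi. \tag{3.16}$$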

This integral is absolutely convergent, and the series defining w(x) is absolutely and 
uniformly convergent. Notice that the LIC plays an important role in establishing 
these convergence properties and the various uses of Fubini’s theorem needed for 
the formulas developed here. 

To complete the proof of Theorem 3.1 we argue as follows. Let us assume (3.11).
Then, by Eq. (3.16),

$$\hat{w}(\alpha) = \delta_{\alpha,0} \int_{\widehat{\mathbb{R}}^n} \hat{f}(\xi)\, \overline{\hat{f}(\xi + \alpha)}\, d\xi.$$

By Eq. (3.15), this implies

$$w(x) = N^2(T_x f) = \sum_{\alpha \in \Lambda} \hat{w}(\alpha)\, e^{2\pi i \alpha \cdot x} = \hat{w}(0) = \|f\|_2^2.$$

Hence, the system (3.3) is a Parseval frame for $L^2(\mathbb{R}^n)$.

Conversely, let us now assume that the system (3.3) is a Parseval frame for
$L^2(\mathbb{R}^n)$. Hence, by our assumptions, we know that

$$N^2(T_x f) = w(x) = \sum_{\alpha \in \Lambda} \hat{w}(\alpha)\, e^{2\pi i \alpha \cdot x} = \|T_x f\|_2^2 = \|f\|_2^2$$

for all $f \in \mathcal{D}$.

Since $\Lambda$ is countable and the “Fourier coefficients” $\hat{w}(\alpha)$ of this generalized
Fourier series are unique, we must have $\hat{w}(\alpha) = 0$ if $\alpha \ne 0$ and $\hat{w}(0) = \|f\|_2^2$. We can
then use (3.16) and appropriate choices of $f \in \mathcal{D}$ to show that Eq. (3.11) must
hold. For example, by letting $f$ be such that $\hat{f}(\xi) = \hat{f}_\varepsilon(\xi) = (1/\sqrt{|B(\varepsilon)|})\, \chi_{B(\varepsilon)}(\xi - \xi_0)$,
where $B(\varepsilon)$ is a ball of radius $\varepsilon$ about the origin, $\varepsilon > 0$, and $\xi_0$ is a point
of differentiability of the integral of $h(\xi) = \sum_{p \in \mathcal{P}} (1/|\det C_p|)\, |\hat{g}_p(\xi)|^2$, one obtains
easily from (3.16) that $h(\xi_0) = 1$ (here $\|f_\varepsilon\|_2 = 1$, so that $\hat{w}(0) = 1$). This gives (3.11) when $\alpha = 0$.

This is, to conclude, the basic idea of the proof of Theorem 3.1. The role played
by these generalized Fourier series is arrived at naturally; it arises from the importance
of the notion of shift invariance, which is essentially related to the structure of
these families of reproducing systems.

Theorem 3.1 has many applications, several of which are described in
[137, 138]. As mentioned above, they include Gabor, affine, and wave packet
systems. Theorem 3.1 applies also to the quasi-affine systems. In dimension $n = 1$,
these are the systems $\{\tilde{\psi}_{j,k} : j,k \in \mathbb{Z}\}$ obtained from $\psi \in L^2(\mathbb{R})$ by setting

$$\tilde{\psi}_{j,k} = \begin{cases} D_{2^{-j}} T_k\, \psi, & j \ge 0, \\ 2^{j/2}\, T_k D_{2^{-j}}\, \psi, & j < 0. \end{cases}$$

These systems (as well as their higher-dimensional versions) were introduced by
Ron and Shen in [205]. They pointed out that, unlike the affine systems, these systems
are shift-invariant. Furthermore, the quasi-affine system $\{\tilde{\psi}_{j,k}\}$ is a Parseval
frame if and only if the corresponding affine system $\{\psi_{j,k}\}$ is a Parseval frame.

Recall that, in higher dimensions, affine and quasi-affine systems are typically
defined using dilations of the form $D_M^j$, where $M$ is an expanding matrix: that is,
each proper value $\lambda$ of $M$ satisfies $|\lambda| > 1$. Notice that this condition is equivalent
to the existence of constants $k$ and $\gamma$, satisfying $0 < k \le 1 < \gamma < \infty$, such that

$$|M^j x| \ge k\, \gamma^j\, |x| \tag{3.17}$$

when $x \in \mathbb{R}^n$, $j \in \mathbb{Z}$, $j \ge 0$, and

$$|M^j x| \le \frac{1}{k}\, \gamma^j\, |x| \tag{3.18}$$

when $x \in \mathbb{R}^n$, $j \in \mathbb{Z}$, $j \le 0$. One remarkable property of Theorem 3.1 is that it
applies not only to the case of expanding-dilation matrices, but also to a more general
class of dilations that are expanding on a subspace [137] and are defined as
follows.

Definition 3.7. Given $M \in GL_n(\mathbb{R})$ and a nonzero linear subspace $F$ of $\mathbb{R}^n$, we
say that $M$ is expanding on $F$ if there exists a complementary (not necessarily
orthogonal) linear subspace $E$ of $\mathbb{R}^n$ with the following properties:¹

1. $\mathbb{R}^n = F + E$ and $F \cap E = \{0\}$; that is, for any $x \in \mathbb{R}^n$, there exist unique $x_F \in F$
and $x_E \in E$ such that $x = x_F + x_E$;

2. $M(F) = F$ and $M(E) = E$; that is, $F$ and $E$ are invariant under $M$;

3. conditions (3.17) and (3.18) hold for all $x \in F$;

4. for any $j \ge 0$, there exists $k_1 = k_1(M) > 0$ such that $|x_E| \le k_1 |M^j x_E|$.

¹ This is the revised definition from [123]. It turned out that the definition initially proposed in
[137], with a different condition (iv), was not sufficient to guarantee that the LIC was satisfied.

It is clear that if a matrix M is expanding, then it is also expanding on a subspace. 
However, there are several examples of matrices that satisfy Definition 3.7 and are 
not expanding. For example, the following matrices are all expanding on a subspace: 

• M— ^, where a G M, \a\ > 1; 

• $M = \begin{pmatrix} a & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}$, where $a \in \mathbb{R}$, $|a| > 1$.

It is shown in [123, 137] that for affine systems where the dilation matrix $M$ is
expanding on a subspace, according to the definition above, the LIC is “automatically”
satisfied. Hence, Theorem 3.5 applies to this class of affine systems as
well.
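To make the notion of a matrix that is expanding on a subspace without being expanding more concrete, the sketch below examines the $3 \times 3$ example built from a scalar $a$ with $|a| > 1$ and a planar rotation. The values $a = 2$ and $\theta = 0.3$, the choice $F = \operatorname{span}(e_1)$, $E = \operatorname{span}(e_2, e_3)$, and the test vectors are assumptions made for this illustration.

```python
import numpy as np

a, theta = 2.0, 0.3   # illustrative values (assumptions): any |a| > 1 and any angle
M = np.array([[a,   0.0,            0.0],
              [0.0, np.cos(theta), -np.sin(theta)],
              [0.0, np.sin(theta),  np.cos(theta)]])

# Eigenvalue moduli: one equals |a| > 1 and two equal 1, so M is not expanding.
eig_moduli = np.sort(np.abs(np.linalg.eigvals(M)))

x_F = np.array([1.5, 0.0, 0.0])    # a vector in F = span(e1)
x_E = np.array([0.0, -0.4, 2.0])   # a vector in E = span(e2, e3)

def norm_of_power(j, x):
    """Return |M^j x| for an integer j (negative j uses the inverse of M)."""
    return np.linalg.norm(np.linalg.matrix_power(M, j) @ x)

# On F the norm scales exactly like a^j, for positive and negative j alike.
growth_on_F = [norm_of_power(j, x_F) / np.linalg.norm(x_F) for j in range(-3, 4)]

# On E the rotation preserves the norm, so |x_E| <= k_1 |M^j x_E| holds with k_1 = 1.
ratio_on_E = [norm_of_power(j, x_E) / np.linalg.norm(x_E) for j in range(0, 6)]
```

Here the growth estimates on $F$ hold with $k = 1$ and $\gamma = a$, while on $E$ the matrix neither expands nor contracts.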

The examples seem to suggest that Theorem 3.5 applies whenever the dilation
matrix $M$ has all eigenvalues $|\lambda_k| \ge 1$ and at least one eigenvalue $|\lambda_1| > 1$. However,
this is not the case. In [123] there is an example of a $3 \times 3$ dilation matrix having
eigenvalues $\lambda_1 = a > 1$ and $\lambda_2 = \lambda_3 = 1$, for which the LIC fails. Indeed, it turns out
that the information about the eigenvalues of $M$ alone is not sufficient to determine
the LIC or even the existence of corresponding affine systems. We refer to [144, 215]
for additional results and observations about this topic.


3.3 Continuous Wavelet Transform 

The full affine group of motions on $\mathbb{R}^n$, denoted by $A_n$, consists of all pairs
$(M,t) \in GL_n(\mathbb{R}) \times \mathbb{R}^n$ (endowed with the product topology) together with the group
operation

$$(M,t) \cdot (M',t') = (MM',\ t' + (M')^{-1} t).$$

This operation is associated with the action $x \to M(x + t)$ on $\mathbb{R}^n$. The subgroup
$\mathcal{N} = \{(M,t) \in A_n : M = I,\ t \in \mathbb{R}^n\}$ is clearly a normal subgroup of $A_n$.
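The link between the group operation $(M,t) \cdot (M',t') = (MM',\ t' + (M')^{-1}t)$ and the action $x \to M(x+t)$ can be verified directly: acting with $(M',t')$ and then with $(M,t)$ must agree with acting once with the product. The following is a minimal sketch with randomly generated, well-conditioned matrices; the helper names `act`, `compose`, and `inverse` are hypothetical and introduced only for this example.

```python
import numpy as np

def act(g, x):
    """Action of the group element g = (M, t) on x: x -> M (x + t)."""
    M, t = g
    return M @ (x + t)

def compose(g, h):
    """Group operation: (M, t) . (M', t') = (M M', t' + (M')^{-1} t)."""
    M, t = g
    Mp, tp = h
    return (M @ Mp, tp + np.linalg.solve(Mp, t))

def inverse(g):
    """Inverse element: (M, t)^{-1} = (M^{-1}, -M t)."""
    M, t = g
    return (np.linalg.inv(M), -M @ t)

rng = np.random.default_rng(2)
n = 3
g = (rng.normal(size=(n, n)) + 3 * np.eye(n), rng.normal(size=n))
h = (rng.normal(size=(n, n)) + 3 * np.eye(n), rng.normal(size=n))
x = rng.normal(size=n)

two_steps = act(g, act(h, x))      # apply h first, then g
one_step = act(compose(g, h), x)   # apply the product once

# Conjugating a pure translation (I, s) by g gives the pure translation (I, M s),
# which is consistent with the translation subgroup being normal.
s = rng.normal(size=n)
conj_M, conj_t = compose(compose(g, (np.eye(n), s)), inverse(g))
```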

We consider a class of subgroups $\{G\}$ of $A_n$ of the form

$$G = \{(M,t) \in A_n : M \in \mathcal{D},\ t \in \mathbb{R}^n\},$$

where $\mathcal{D}$ is a closed subgroup of $GL_n(\mathbb{R})$. We can identify $\mathcal{D}$ with the subgroup
$\{(M,t) \in G : M \in \mathcal{D},\ t = 0\}$. Hence, we refer to $\mathcal{D}$ as the dilation subgroup and
to $\mathcal{N}$ as the translation subgroup of $G$. If $\mu$ is the left Haar measure for $\mathcal{D}$, then
$d\lambda(M,t) = d\mu(M)\, dt$ is the element of the left Haar measure for $G$.
Let $U$ be the unitary representation of $G$ acting on $L^2(\mathbb{R}^n)$ defined by

$$(U_{(M,t)}\, \psi)(x) = |\det M|^{-1/2}\, \psi(M^{-1}x - t) := \psi_{M,t}(x), \tag{3.19}$$

for $(M,t) \in G$ and $\psi \in L^2(\mathbb{R}^n)$. The elements $\{\psi_{M,t} : (M,t) \in G\}$ are the continuous
affine systems with respect to $G$. The corresponding expression in the frequency
domain is

$$(U_{(M,t)}\, \psi)^\wedge(\xi) = |\det M|^{1/2}\, \hat{\psi}(\xi M)\, e^{-2\pi i \xi M t}.$$

For a fixed $\psi \in L^2(\mathbb{R}^n)$, the wavelet transform associated with $G$ is the mapping

$$f \mapsto (W_\psi f)(M,t) = \langle f, \psi_{M,t} \rangle = |\det M|^{-1/2} \int_{\mathbb{R}^n} f(y)\, \overline{\psi(M^{-1}y - t)}\, dy,$$

where $f \in L^2(\mathbb{R}^n)$ and $(M,t) \in G$. If there exists a function $\psi \in L^2(\mathbb{R}^n)$ such that,
for all $f \in L^2(\mathbb{R}^n)$, the reproducing formula

$$f = \int_G \langle f, \psi_{M,t} \rangle\, \psi_{M,t}\, d\lambda(M,t) \tag{3.20}$$

holds, then $\psi$ is a continuous wavelet with respect to $G$. Expression (3.20) is a generalized
version of the Calderón reproducing formula (3.6) presented in Section 3.1.
Notice that Eq. (3.20) is understood in the weak sense (see the proof of Theorem 3.8
below); the pointwise result is much more subtle.

The following theorem establishes an admissibility condition for $\psi$ that guarantees
that (3.20) is satisfied:

Theorem 3.8. Equation (3.20) is valid for all $f \in L^2(\mathbb{R}^n)$ if and only if for a.e.
$\xi \in \mathbb{R}^n \setminus \{0\}$,

$$\Delta_\psi(\xi) = \int_{\mathcal{D}} |\hat{\psi}(\xi M)|^2\, d\mu(M) = 1. \tag{3.21}$$

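Proof. Suppose that (3.21) is satisfied. Then, by direct computation, we have that

$$\begin{aligned}
\|W_\psi f\|_{L^2(G,\lambda)}^2 &= \int_{\mathcal{D}} \int_{\mathbb{R}^n} |\langle f, \psi_{M,t} \rangle|^2\, dt\, d\mu(M) \\
&= \int_{\mathcal{D}} \int_{\mathbb{R}^n} \Big| \int_{\widehat{\mathbb{R}}^n} \hat{f}(\xi)\, \overline{\hat{\psi}(\xi M)}\, e^{2\pi i \xi M t}\, d\xi \Big|^2\, |\det M|\, dt\, d\mu(M) \\
&= \int_{\mathcal{D}} \Big( \int_{\mathbb{R}^n} \big| \big( \hat{f}\, \overline{\hat{\psi}(\cdot\, M)} \big)^\vee (Mt) \big|^2\, |\det M|\, dt \Big)\, d\mu(M) \\
&= \int_{\mathcal{D}} \int_{\widehat{\mathbb{R}}^n} |\hat{f}(\xi)|^2\, |\hat{\psi}(\xi M)|^2\, d\xi\, d\mu(M) \\
&= \int_{\widehat{\mathbb{R}}^n} |\hat{f}(\xi)|^2\, \Delta_\psi(\xi)\, d\xi \\
&= \|f\|_2^2.
\end{aligned}$$

This shows that the mapping $W_\psi : L^2(\mathbb{R}^n) \to L^2(G, \lambda)$ is an isometry. By polarization,
we then obtain

$$\langle W_\psi f, W_\psi g \rangle_{L^2(G)} = \langle f, g \rangle_{L^2(\mathbb{R}^n)}, \tag{3.22}$$

for all $f, g \in L^2(\mathbb{R}^n)$.

Conversely, suppose that Eq. (3.20) holds in the weak sense [i.e., (3.22) holds].
Consider the expression

$$\int_{\widehat{\mathbb{R}}^n} |\hat{f}(\xi)|^2\, \Delta_\psi(\xi)\, d\xi,$$

with $\hat{f}$ satisfying $|\hat{f}(\xi)|^2 = |B(r, \xi_0)|^{-1} \chi_{B(r,\xi_0)}(\xi)$, where $B(r, \xi_0)$ is a ball of radius
$r$ and center $\xi_0$, and $\xi_0$ is a point of differentiability of the integral of $\Delta_\psi$. Then, by reversing the
chain of equalities above, we obtain that

$$|B(r, \xi_0)|^{-1} \int_{B(r,\xi_0)} \Delta_\psi(\xi)\, d\xi = 1,$$

for all $r > 0$. By taking $\lim_{r \to 0^+}$, we conclude that $\Delta_\psi(\xi_0) = 1$. Thus, $\Delta_\psi(\xi) = 1$ for
a.e. $\xi \in \mathbb{R}^n$. □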
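The admissibility condition, namely that the integral of $|\hat{\psi}(\xi M)|^2$ over the dilation group with respect to its Haar measure equals 1 for a.e. $\xi$, can be checked numerically in the simplest setting. The sketch below assumes $n = 1$, the dilation group of all $a > 0$ with Haar measure $da/a$, and the generator with $\hat{\psi}(\xi) = \sqrt{2}\,\xi\, e^{-\xi^2/2}$; this generator and the quadrature parameters are assumptions chosen for illustration. With the substitution $u = a\xi$ the integral becomes $2\int_0^\infty u\, e^{-u^2}\, du = 1$ for every $\xi \ne 0$.

```python
import numpy as np

def psi_hat(xi):
    """Assumed generator in the frequency domain: sqrt(2) * xi * exp(-xi^2 / 2)."""
    return np.sqrt(2.0) * xi * np.exp(-xi**2 / 2.0)

def delta_psi(xi, s_min=-20.0, s_max=20.0, n=4001):
    """Approximate the integral of |psi_hat(a xi)|^2 da/a over a > 0.
    Substituting a = exp(s) turns da/a into ds; a trapezoid rule is then applied."""
    s = np.linspace(s_min, s_max, n)
    h = s[1] - s[0]
    y = np.abs(psi_hat(np.exp(s) * xi)) ** 2
    return h * (np.sum(y) - 0.5 * (y[0] + y[-1]))

frequencies = (-7.0, -0.3, 0.05, 1.0, 12.0)
values = [delta_psi(xi) for xi in frequencies]   # each value should be close to 1
```

That the result does not depend on $\xi$ reflects the invariance of the Haar measure $da/a$ under the rescaling $a \to a|\xi|$.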

Theorem 3.8 can easily be extended to the case where $\mathcal{D}$ is not a subgroup of
$GL_n(\mathbb{R})$, but simply a subset of $GL_n(\mathbb{R})$. Furthermore, Theorem 3.8 extends to functions
on subspaces of $L^2(\mathbb{R}^n)$ of the form

$$L^2(V)^\vee = \{f \in L^2(\mathbb{R}^n) : \operatorname{supp} \hat{f} \subset V\}.$$

The proof of this fact is left as an exercise.

In the special case of Theorem 3.8 where $n = 1$ and $\mathcal{D} = \{2^j : j \in \mathbb{Z}\}$, Eq. (3.21)
is $\sum_{j \in \mathbb{Z}} |\hat{\psi}(2^j \xi)|^2 = 1$ for a.e. $\xi \in \mathbb{R}$ (this is the classical Calderón equation), and
Eq. (3.20) is

$$f = \sum_{j \in \mathbb{Z}} \int_{\mathbb{R}} \langle f, \psi_{j,t} \rangle\, \psi_{j,t}\, dt, \tag{3.23}$$

where $\psi_{j,t}(x) := 2^{j/2}\, \psi(2^j x - t)$, $j \in \mathbb{Z}$, $t \in \mathbb{R}$. Thus, the classical orthonormal
wavelet expansion

$$f = \sum_{j \in \mathbb{Z}} \sum_{k \in \mathbb{Z}} \langle f, \psi_{j,k} \rangle\, \psi_{j,k}$$

is a “discretization” of (3.23). This shows, by Eq. (3.4), that an orthonormal wavelet
(in this classical case) is always a continuous wavelet satisfying property (3.23) for
all $f \in L^2(\mathbb{R})$. This raises the question of how to “discretize” continuous wavelets
associated with general dilation groups $\mathcal{D}$. We refer to [234] for more observations
about this topic.

A variant of the affine group $A_n$ [and the corresponding affine systems (3.19)] is
obtained by considering the group $G^*$ consisting of all pairs $(M,t) \in GL_n(\mathbb{R}) \times \mathbb{R}^n$
(endowed with the product topology) together with the group operation

$$(M,t) \cdot (M',t') = (MM',\ t + Mt').$$

This operation is associated with the action $x \to Mx + t$ on $\mathbb{R}^n$. The co-affine systems
associated with $G^*$ are then defined as the elements

$$(U_{(M,t)}\, \psi)(x) = |\det M|^{-1/2}\, \psi(M^{-1}(x - t)) := \psi^*_{M,t}(x),$$

for $(M,t) \in G^*$ and $\psi \in L^2(\mathbb{R}^n)$. The corresponding expression in the frequency
domain is

$$(U_{(M,t)}\, \psi)^\wedge(\xi) = |\det M|^{1/2}\, \hat{\psi}(\xi M)\, e^{-2\pi i \xi t}.$$

The left Haar measure, $\lambda^*$, for $G^*$ is easily seen to satisfy

$$d\lambda^*(M,t) = |\det M|^{-1}\, d\mu(M)\, dt,$$

where $\mu$ is the left Haar measure for the dilation group $\mathcal{D}$. Then the “co-affine” reproducing
formula is

$$f = \int_{G^*} \langle f, \psi^*_{M,t} \rangle\, \psi^*_{M,t}\, d\lambda^*(M,t). \tag{3.24}$$

A straightforward calculation shows that (3.24) holds if and only if $\psi$ satisfies condition
(3.21). Thus, $\psi$ is a continuous affine wavelet if and only if it is a continuous
co-affine wavelet.

Notice that the situation observed above is different from the discrete case. In
fact, consider the systems $\Psi = \{\psi_{j,k} = 2^{-j/2}\, \psi(2^{-j} \cdot - k) : j,k \in \mathbb{Z}\}$ and $\Psi^* =
\{\psi^*_{j,k} = 2^{-j/2}\, \psi(2^{-j}(\cdot - k)) : j,k \in \mathbb{Z}\}$. A simple calculation (the change of variable $y = x - k$) shows that

$$\langle \psi^*_{j,k}, \psi_{-1,-1} \rangle = \langle \psi_{j,0}, \psi_{-1,-2k-1} \rangle.$$

This shows that the co-affine systems cannot generate the space $L^2(\mathbb{R})$ if the corresponding
affine system $\Psi$ is an orthonormal basis for $L^2(\mathbb{R})$. In fact, since $-2k - 1$ is
never 0, if the affine system $\Psi$ is an orthonormal basis for $L^2(\mathbb{R})$, then the
right-hand side of the above expression is zero for all $j, k \in \mathbb{Z}$, and so the co-affine system
$\Psi^*$ has a nontrivial orthogonal complement.


3.3.1 Admissible Groups 


It is not difficult to show that there are dilation groups for which one can find
no functions $\psi$ satisfying Eq. (3.21). In particular, if $\mathcal{D}$ is compact, there are no
associated functions $\psi$ that satisfy this condition. For example, let $\mathcal{D} = SO(2)$ and
suppose that there is a function $\psi \in L^2(\mathbb{R}^2)$ satisfying (3.21). Notice that, in this
case, using polar coordinates, Eq. (3.21) can be expressed as

$$\int_0^{2\pi} |\hat\psi(r e^{i(\theta + \varphi)})|^2\, d\varphi = 1,$$

for a.e. $\xi = r e^{i\theta}$. Multiplying both sides of the equality by $r > 0$ and integrating
with respect to $r \in [0, \infty)$, we obtain

$$\|\psi\|_2^2 = \int_0^{\infty}\!\!\int_0^{2\pi} |\hat\psi(r e^{i\varphi})|^2\, d\varphi\, r\, dr = \int_0^{\infty} r\, dr = \infty.$$

This is clearly a contradiction and, thus, there is no $\psi$ satisfying (3.21). In this
situation, we say that the group $SO(2)$ is not admissible. That a general compact
$\mathcal{D} \subset GL_n(\mathbb{R})$ is not admissible is not much harder to prove (see last paragraph in
this section).

The observation above leads to the question: What are the groups $\mathcal{D}$ that are
admissible? Our result on admissibility involves the notion of $\varepsilon$-stabilizer of $x \in \mathbb{R}^n$,
which is defined as the set

$$\mathcal{D}_x^{\varepsilon} = \{M \in \mathcal{D} : |xM - x| \le \varepsilon\},$$

for each $\varepsilon > 0$. The set $\mathcal{D}_x := \mathcal{D}_x^{0} = \{M \in \mathcal{D} : xM = x\}$ is called the stabilizer of $x$.
The modular function $\Delta$ on $\mathcal{D}$, defined by the property

$$\mu(EM) = \Delta(M)\, \mu(E)$$

for all $\mu$-measurable $E \subset \mathcal{D}$ and $M \in \mathcal{D}$, also plays an important role in the following
basic result about admissible dilation groups.

Theorem 3.9. (a) If $\mathcal{D}$ is admissible, then $\Delta \not\equiv |\det M|$ and the stabilizer of $x$ is
compact for a.e. $x \in \mathbb{R}^n$.

(b) If $\Delta \not\equiv |\det M|$ and for a.e. $x \in \mathbb{R}^n$ there exists an $\varepsilon > 0$ such that the
$\varepsilon$-stabilizer of $x$ is compact, then $\mathcal{D}$ is admissible.

The proof of Theorem 3.9 is rather involved and can be found in [163]. Even
though Theorem 3.9 “just fails” to be a characterization of admissibility, it is
still quite useful for determining the admissibility or nonadmissibility of particular
groups $\mathcal{D}$. For example, if $\mathcal{D}$ is compact, then $\Delta = |\det M| = 1$ and, thus, it cannot
be admissible. Another example where Theorem 3.9 can be used effectively is the
case where $\mathcal{D}$ is a one-parameter group. Namely, let $\mathcal{D} = \{M_t = e^{tL} : t \in \mathbb{R}\}$, where $L$
is a real $n \times n$ matrix. Then $\mathcal{D}$ is admissible if and only if $\operatorname{trace}(L) \ne 0$. Indeed, since
$\det M_t = e^{t\,\operatorname{trace}(L)}$ and $\mathcal{D}$ is Abelian, it follows that the modular function, $\Delta$, is identically 1.
Thus, when $\operatorname{trace}(L) \ne 0$, we have that $\det M_t \not\equiv 1 = \Delta$ and $\mathcal{D}$ is admissible.
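The identity $\det e^{tL} = e^{t\,\operatorname{trace}(L)}$ used in the one-parameter example can be checked numerically. The sketch below is only an illustration: the matrix `L` is an arbitrary choice, and the matrix exponential is computed with a truncated Taylor series, which is adequate for small matrices with modest entries but is not a general-purpose method.

```python
import math

def mat_mul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(L, t, terms=40):
    """Matrix exponential e^{tL} by a truncated Taylor series (small matrices only)."""
    n = len(L)
    tL = [[t * L[i][j] for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes (tL)^k / k!
        term = mat_mul(term, tL)
        term = [[entry / k for entry in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

# Hypothetical generator with nonzero trace.
L = [[1.0, 2.0], [-0.5, 0.25]]
trace_L = L[0][0] + L[1][1]

for t in (-1.0, 0.3, 2.0):
    M_t = expm(L, t)
    print(t, det2(M_t), math.exp(t * trace_L))
```

The two printed columns agree up to rounding, so $|\det M_t|$ is not identically 1 when the trace is nonzero, while the modular function of an Abelian group is identically 1.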

3.3.2 Wave Packet Systems 

In [57], Córdoba and Fefferman introduced “wave packets” as those families of
functions obtained by applying certain collections of dilations, modulations, and
translations to the Gaussian function. More generally, we will describe as
“wave packet systems” any collections of functions that are obtained by applying a
combination of dilations, modulations, and translations to a finite family of functions
in $L^2(\mathbb{R}^n)$. For $\Psi = \{\psi^\ell : 1 \le \ell \le L\} \subset L^2(\mathbb{R}^n)$, where $L \in \mathbb{N}$, and
$S \subset GL_n(\mathbb{R}) \times \mathbb{R}^n$, the continuous wave packet system with respect to $S$ that is generated
by $\Psi$ is the collection

$$\mathcal{WP}_S(\Psi) = \{D_A M_\nu T_y \psi^\ell : (A,\nu) \in S,\ y \in \mathbb{R}^n,\ 1 \le \ell \le L\}, \tag{3.25}$$

where $M_\nu$ is the modulation operator defined at the beginning of Section 3.2.

Let

$$G = \{U = c\, D_A M_\nu T_y : c \in \mathbb{C},\ |c| = 1,\ (A,\nu,y) \in GL_n(\mathbb{R}) \times \mathbb{R}^n \times \mathbb{R}^n\}.$$

$G$ is a subgroup of the unitary operators on $L^2(\mathbb{R}^n)$ that is preserved by the action of
the mapping $U \to \hat U$, where $\hat U \hat f = (Uf)^{\wedge}$.

In the definition (3.25), we considered the map $(A,\nu,y) \to U^{(0)}_{(A,\nu,y)} = D_A M_\nu T_y$,
which is a one-to-one mapping from $S \times \mathbb{R}^n$ into the group $G$. By changing the order
of the operators, we can also define the following one-to-one mappings from $S \times \mathbb{R}^n$
into $G$:

$$U^{(1)}_{(A,\nu,y)} = D_A T_y M_\nu,$$
$$U^{(2)}_{(A,\nu,y)} = T_y D_A M_\nu,$$
$$U^{(3)}_{(A,\nu,y)} = M_\nu D_A T_y,$$
$$U^{(4)}_{(A,\nu,y)} = T_y M_\nu D_A,$$
$$U^{(5)}_{(A,\nu,y)} = M_\nu T_y D_A.$$

Hence, we can generate alternate continuous wave packet systems, $\mathcal{WP}^{(i)}_S(\Psi)$,
by replacing $U^{(0)}_{(A,\nu,y)}$ with $U^{(i)}_{(A,\nu,y)}$ for $1 \le i \le 5$. The systems $\mathcal{WP}^{(0)}_S(\Psi)$ and
$\mathcal{WP}^{(1)}_S(\Psi)$ are equivalent in the sense that one is a Parseval frame if and only if
the other one is a Parseval frame (in fact, by the commutativity relations of translations
and modulations, they only differ by a unimodular scalar factor). The same
is true for $\mathcal{WP}^{(4)}_S(\Psi)$ and $\mathcal{WP}^{(5)}_S(\Psi)$. The other systems, on the other hand, have
substantial differences.

Each $U^{(i)}_{(A,\nu,y)}$, $i = 0,\dots,5$, is associated with a continuous wave packet
system generated by $\Psi \subset L^2(\mathbb{R}^n)$. We can characterize those $\Psi$ for which we have
Parseval frames:




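$$\sum_{\ell=1}^{L} \int_{\mathbb{R}^n}\!\int_S \big|\langle f, U^{(i)}_{(A,\nu,y)} \psi^\ell \rangle\big|^2\, d\lambda(A,\nu)\, dy = \|f\|^2$$

for all $f \in L^2(\mathbb{R}^n)$, where $\lambda$ is a measure on $S$. Such a characterization is an
extension of Theorem 3.8 and is given by an analogue of Eq. (3.21). Explicitly,
we have the result: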
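Theorem 3.10. Let $\Psi = \{\psi^\ell : 1 \le \ell \le L\} \subset L^2(\mathbb{R}^n)$. The systems $\mathcal{WP}^{(i)}_S(\Psi)$,
$i = 0,\dots,5$, are continuous Parseval frame wave packet systems with respect to
$(S, \lambda)$, for $L^2(\mathbb{R}^n)$, if and only if

$$\Delta^{(i)}(\xi) = 1 \quad \text{for a.e. } \xi \in \mathbb{R}^n,$$

where

$$\Delta^{(1)}(\xi) = \Delta^{(0)}(\xi) = \sum_{\ell=1}^{L} \int_S |\hat\psi^\ell(\xi A^{-1} - \nu)|^2\, d\lambda(A,\nu);$$

$$\Delta^{(2)}(\xi) = \sum_{\ell=1}^{L} \int_S |\hat\psi^\ell(\xi A^{-1} - \nu)|^2\, |\det A|^{-1}\, d\lambda(A,\nu);$$

$$\Delta^{(3)}(\xi) = \sum_{\ell=1}^{L} \int_S |\hat\psi^\ell((\xi - \nu) A^{-1})|^2\, d\lambda(A,\nu);$$

$$\Delta^{(5)}(\xi) = \Delta^{(4)}(\xi) = \sum_{\ell=1}^{L} \int_S |\hat\psi^\ell((\xi - \nu) A^{-1})|^2\, |\det A|^{-1}\, d\lambda(A,\nu).$$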


3.4 Affine Systems with Composite Dilations 

To describe the class of systems that will be considered in this section, it will be 
useful to begin with one example in L 2 (M 2 ). 

Let $A = \begin{pmatrix} 2 & 0 \\ 0 & \varepsilon \end{pmatrix}$, where $\varepsilon \ne 0$, $B = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, and $G = \{(B^j, k) : j \in \mathbb{Z},\ k \in \mathbb{Z}^2\}$.

Then $G$ is a group with group multiplication:

$$(B^\ell, m)\,(B^j, k) = (B^{\ell + j},\, k + B^{-j} m). \tag{3.26}$$

In particular, we have $(B^j, k)^{-1} = (B^{-j}, -B^j k)$. The multiplication (3.26) is consistent
with the operation that maps $x \to B^j(x + k)$ of $\mathbb{R}^2$ into $\mathbb{R}^2$. Let $\pi$ be the unitary
representation of $G$, acting on $L^2(\mathbb{R}^2)$, which is defined by

$$(\pi(B^j, k) f)(x) = f((B^j, k)^{-1} x) = f(B^{-j} x - k) = (D_B^j T_k f)(x), \tag{3.27}$$

for $f \in L^2(\mathbb{R}^2)$. Notice that $\det B^j = 1$. The observation that

$$(D_B^\ell T_m)(D_B^j T_k) = D_B^{\ell + j}\, T_{k + B^{-j} m},$$

where $\ell, j \in \mathbb{Z}$, $k, m \in \mathbb{Z}^2$, shows how the group operation (3.26) is associated with
the unitary representation (3.27).
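The multiplication rule (3.26) and the inverse formula can be verified on integer vectors. The sketch below assumes $B = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$, so that $B^j$ sends the column vector $(v_1, v_2)$ to $(v_1 + j v_2, v_2)$; a group element $(B^j, k)$ is stored as the pair `(j, k)`, which is an encoding chosen here for illustration only.

```python
def shear_apply(j, v):
    """Apply B^j = [[1, j], [0, 1]] to the column vector v."""
    return (v[0] + j * v[1], v[1])

def act(g, x):
    """Action x -> B^j (x + k) of the element g = (j, k)."""
    j, k = g
    return shear_apply(j, (x[0] + k[0], x[1] + k[1]))

def compose(g, h):
    """Product (B^l, m)(B^j, k) = (B^(l+j), k + B^(-j) m), as in (3.26)."""
    (l, m), (j, k) = g, h
    Bm = shear_apply(-j, m)
    return (l + j, (k[0] + Bm[0], k[1] + Bm[1]))

def inverse(g):
    """Inverse (B^j, k)^(-1) = (B^(-j), -B^j k)."""
    j, k = g
    Bk = shear_apply(j, k)
    return (-j, (-Bk[0], -Bk[1]))

identity = (0, (0, 0))
g = (3, (2, -5))
h = (-1, (7, 4))
x = (11, -6)

print(act(compose(g, h), x) == act(g, act(h, x)))   # product matches successive actions
print(compose(g, inverse(g)) == identity)
print(compose(inverse(g), g) == identity)
```

All three comparisons hold, which is the sense in which (3.26) is "consistent with" the map $x \to B^j(x + k)$.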

Let $S_0 = \{\xi = (\xi_1, \xi_2) \in \mathbb{R}^2 : |\xi_1| \le 1\}$ and define

$$V_0 = L^2(S_0)^{\vee} = \{f \in L^2(\mathbb{R}^2) : \operatorname{supp} \hat f \subset S_0\}.$$

Since

$$(\pi(B^j, k) f)^{\wedge}(\xi) = (D_B^j T_k f)^{\wedge}(\xi) = e^{-2\pi i \xi B^j k}\, \hat f(\xi B^j),$$

and $\xi B^j = (\xi_1, \xi_2) B^j = (\xi_1, \xi_2 + j \xi_1)$, the action of $B^j$ maps the vertical strip
domain $S_0$ into itself and, thus, the space $V_0$ is invariant under the action of $\pi(B^j, k)$.
The same invariance property holds for the vertical strips

$$S_i = S_0 A^i = \{\xi = (\xi_1, \xi_2) \in \mathbb{R}^2 : |\xi_1| \le 2^i\},$$

$i \in \mathbb{Z}$, and, as a consequence, the spaces $V_i = L^2(S_i)^{\vee}$ are also invariant under the
action of the operators $\pi(B^j, k)$.

The spaces $\{V_i\}_{i \in \mathbb{Z}}$ defined above satisfy the basic MRA properties:

1. $V_i \subset V_{i+1}$, $i \in \mathbb{Z}$;

2. $D_A^{-i} V_0 = V_i$;

3. $\bigcap_{i \in \mathbb{Z}} V_i = \{0\}$;

4. $\overline{\bigcup_{i \in \mathbb{Z}} V_i} = L^2(\mathbb{R}^2)$.

The complete definition of an MRA includes the assumption that $V_0$ is generated by
the integer-translates of a $\phi \in V_0$, called the scaling function, and that these translates
$\{T_k \phi : k \in \mathbb{Z}^2\}$ are an orthonormal basis of $V_0$. In some cases, there is more than
one scaling function.

The situation here is a bit different, and the scaling property is replaced by an
analogous property. Namely, consider $V_0 = L^2(S_0)^{\vee}$ and let $\hat\phi = \chi_I$, where $I = I^+ \cup I^-$,
$I^+$ is the triangle with vertices (0,0), (1,0), (1,1) and $I^-$ is the triangle with
vertices (0,0), (−1,0), (−1,−1). The sets $I B^j$, $j \in \mathbb{Z}$, form a partition of $S_0$; that
is, $S_0 = \bigcup_{j \in \mathbb{Z}} I B^j$ except for the set of points $\{(0, \xi_2) : \xi_2 \ne 0\}$, which is, however,
a set of measure 0. The set $I$ has measure 1 and the collection $\{e^{-2\pi i \xi k} \chi_I : k \in \mathbb{Z}^2\}$
is easily seen to be an ON basis of $L^2(I)$. Since

$$\big(e^{-2\pi i \xi k}\, \chi_I(\xi)\big)^{\vee}(x) = (T_k \phi)(x) = \phi(x - k),$$

these last functions form an ON basis of $L^2(I)^{\vee}$. It follows that $\{D_B^j T_k \phi : k \in \mathbb{Z}^2\}$
is an ON basis of $L^2(I B^{-j})^{\vee}$, for each $j \in \mathbb{Z}$. Hence, the set

$$\{D_B^j T_k \phi : j \in \mathbb{Z},\ k \in \mathbb{Z}^2\} = \{T_k D_B^j \phi : j \in \mathbb{Z},\ k \in \mathbb{Z}^2\}$$

is an ON basis of $V_0$. The sets $I^+$, $I^-$, as well as the other sets used in this construction,
are illustrated in Fig. 3.1.
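The partition of the strip $S_0$ by the sheared triangles $I B^j$ can be made concrete. The sketch below assumes the row-vector convention $\xi B^j = (\xi_1, \xi_2 + j\xi_1)$ and a particular half-open choice of triangle boundaries (an assumption made here so that the tiles are exactly disjoint; the text only specifies the triangles up to sets of measure zero). Under these conventions, a point with $0 < |\xi_1| \le 1$ lies in $I B^j$ exactly for $j = \lfloor \xi_2 / \xi_1 \rfloor$.

```python
from fractions import Fraction
import math

def in_I(xi):
    """Membership in I = I+ U I-, with half-open boundaries (an assumed convention)."""
    x1, x2 = xi
    if 0 < x1 <= 1:
        return 0 <= x2 < x1          # triangle with vertices (0,0), (1,0), (1,1)
    if -1 <= x1 < 0:
        return x1 < x2 <= 0          # triangle with vertices (0,0), (-1,0), (-1,-1)
    return False

def shear_row(xi, j):
    """Row vector times B^j: (x1, x2) -> (x1, x2 + j*x1)."""
    return (xi[0], xi[1] + j * xi[0])

def tile_index(xi):
    """Index j with xi in I B^j, for a point with 0 < |x1| <= 1."""
    return math.floor(xi[1] / xi[0])

samples = [(Fraction(3, 4), Fraction(-7, 3)), (Fraction(-1, 2), Fraction(5, 4))]
for xi in samples:
    # xi lies in I B^j exactly when xi B^(-j) lies in I.
    hits = [j for j in range(-10, 11) if in_I(shear_row(xi, -j))]
    print(xi, hits, tile_index(xi))
```

Each sample point is found in exactly one tile, and that tile is the one predicted by `tile_index`.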

Thus, the “complete” definition of the MRA, introduced above, adds to properties
1-4 the property

5. $V_0$ is generated by a “scaling function” $\phi$, in the sense that $\{T_k D_B^j \phi : j \in \mathbb{Z},\ k \in \mathbb{Z}^2\}$
is an ON basis of $V_0$.

2 Recall that, according to the notation introduced in Section 3.2, in the frequency domain, the
matrices $B^j$ multiply row vectors on the right.

Fig. 3.1: Example of ON AB-MRA. The sets $\{I^+ B^j, I^- B^j : j \in \mathbb{Z}\}$ form a disjoint partition of $S_0$.

Let $G_B$ be the group $\{B^j : j \in \mathbb{Z}\}$; this is equivalent to the dilation group
$\{D_B^j : j \in \mathbb{Z}\}$. Then $G = \{(B^j, k) : j \in \mathbb{Z},\ k \in \mathbb{Z}^2\}$ is the semidirect product of
$G_B$ and $\mathbb{Z}^2$, denoted by $G_B \ltimes \mathbb{Z}^2$. This shows that the shift invariance of the traditional
MRA is replaced by a notion of $G_B \ltimes \mathbb{Z}^2$ invariance, that is, the space $V_0$ is
invariant with respect to both integer translations and $G_B$ dilations.

We shall now show how the MRA we just introduced can be used to construct
a wavelet-like basis of $L^2(\mathbb{R}^2)$. We begin by constructing an ON basis of $W_0$,
defined to be the orthogonal complement of $V_0$ in $V_1$, that is, $V_1 = V_0 \oplus W_0$. It will
be convenient to work in the frequency domain. Since $V_1 = V_0 \oplus W_0$, it follows that
$W_0 = L^2(R_0)^{\vee}$, where $R_0 = S_1 \setminus S_0 = \{\xi = (\xi_1, \xi_2) \in \mathbb{R}^2 : 1 < |\xi_1| \le 2\}$.
We define the following subsets of $R_0 = S_1 \setminus S_0$:

$$I_1 = I_1^+ \cup I_1^-, \quad I_2 = I_2^+ \cup I_2^-, \quad I_3 = I_3^+ \cup I_3^-,$$

where

$$I_1^+ = \{\xi = (\xi_1, \xi_2) \in \mathbb{R}^2 : 1 < \xi_1 \le 2,\ 0 \le \xi_2 < \tfrac{1}{2}\},$$

$$I_2^+ = \{\xi = (\xi_1, \xi_2) \in \mathbb{R}^2 : 1 < \xi_1 \le 2,\ \tfrac{1}{2} \le \xi_2 < 1\},$$

$$I_3^+ = \{\xi = (\xi_1, \xi_2) \in \mathbb{R}^2 : 1 < \xi_1 \le 2,\ 1 \le \xi_2 < \xi_1\},$$

and $I_\ell^- = \{\xi \in \mathbb{R}^2 : -\xi \in I_\ell^+\}$, $\ell = 1,2,3$. These sets are illustrated in Fig. 3.1
and 3.2. Observe that each set $I_\ell$ is a fundamental domain for $\mathbb{Z}^2$: The functions
$\{e^{2\pi i \xi k} : k \in \mathbb{Z}^2\}$, restricted to $I_\ell$, form an ON basis for $L^2(I_\ell)$, $\ell = 1,2,3$. We then
define $\psi^\ell$, $\ell = 1,2,3$, by setting $\hat\psi^\ell = \chi_{I_\ell}$, $\ell = 1,2,3$. It follows from the observations
about the sets $\{I_\ell\}$ that the collection

$$\{e^{2\pi i \xi k}\, \hat\psi^\ell(\xi) : k \in \mathbb{Z}^2\}$$

is an orthonormal basis of $L^2(I_\ell)$, $\ell = 1,2,3$. A simple direct calculation shows that
the sets $\{I_\ell B^j : j \in \mathbb{Z},\ \ell = 1,2,3\}$ are a partition of $R_0$, that is,

$$\bigcup_{\ell=1}^{3} \bigcup_{j \in \mathbb{Z}} I_\ell B^j = R_0,$$

where the union is disjoint. As a consequence, the collection

$$\{e^{2\pi i \xi k}\, \hat\psi^\ell(\xi B^j) : k \in \mathbb{Z}^2,\ j \in \mathbb{Z},\ \ell = 1,2,3\} \tag{3.28}$$

is an orthonormal basis of $L^2(R_0)$ and, thus, by taking the inverse Fourier transform
of (3.28), we have that

$$\{\pi(B^j, k)\, \psi^\ell : k \in \mathbb{Z}^2,\ j \in \mathbb{Z},\ \ell = 1,2,3\} \tag{3.29}$$

is an orthonormal basis of $W_0 = L^2(R_0)^{\vee}$. Notice that, since, for each $j \in \mathbb{Z}$ fixed,
$B^j$ maps $\mathbb{Z}^2$ into itself, the collection $\{e^{2\pi i \xi B^j k} : k \in \mathbb{Z}^2\}$ is equal to the collection
$\{e^{2\pi i \xi k} : k \in \mathbb{Z}^2\}$.

Fig. 3.2: Example of an orthonormal AB-wavelet.

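It is clear that, by applying the dilations $D_A^i$, $i \in \mathbb{Z}$, to the system (3.29), we
obtain an ON basis of $L^2(R_i)^{\vee}$, where

$$R_i = R_0 A^i = \{\xi = (\xi_1, \xi_2) \in \mathbb{R}^2 : 2^i < |\xi_1| \le 2^{i+1}\}.$$

Furthermore, we have that $\bigcup_{i \in \mathbb{Z}} R_i = \mathbb{R}^2$ (up to a set of measure zero), where the union
is disjoint, and hence we can write $L^2(\mathbb{R}^2) = \bigoplus_{i \in \mathbb{Z}} W_i$. Hence, by combining the
observations above, it follows that the collection

$$\{D_A^i D_B^j T_k \psi^\ell : k \in \mathbb{Z}^2,\ i, j \in \mathbb{Z},\ \ell = 1,2,3\} \tag{3.30}$$

is an ON basis of $L^2(\mathbb{R}^2)$.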


3.4.1 Affine System with Composite Dilations 

The construction given above is a particular example of a general class of affine-like
systems called affine systems with composite dilations, which have the form

$$\mathcal{A}_{AB}(\Psi) = \{D_A D_B T_k \psi^\ell : A \in G_A,\ B \in G_B,\ k \in \mathbb{Z}^n,\ \ell = 1,\dots,L\}, \tag{3.31}$$

where $\Psi = \{\psi^1, \dots, \psi^L\} \subset L^2(\mathbb{R}^n)$, $G_A \subset GL_n(\mathbb{R})$ (usually, $G_A = \{A^i : i \in \mathbb{Z}\}$,
with $A$ expanding or having some “expanding” property), and $G_B \subset GL_n(\mathbb{R})$ with
$|\det B| = 1$. Later on, we will show that there are several examples of such systems
that form ON bases of $L^2(\mathbb{R}^n)$ or, more generally, Parseval frames of $L^2(\mathbb{R}^n)$.

The roles played by the two families of dilations, $G_A$ and $G_B$, in definition (3.31),
are very different. The elements $A \in G_A$ dilate (at least in some direction), while
the elements of $G_B$ affect the geometry of the reproducing system $\mathcal{A}_{AB}(\Psi)$. In the
example we worked out, $G_B = \left\{\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}^{j} : j \in \mathbb{Z}\right\}$ is the shear group and exhibits
a “shear geometry,” in which objects in the plane are stretched vertically without
increasing their size (like the trapezoids in Fig. 3.2). In Section 3.5, we will use this
group and a construction similar to the one above to obtain the shearlets, whose
geometrical properties are similar to the example above and are, in addition, well-localized
functions (i.e., they have rapid decay in both the space and frequency
domains). They have similarities to the curvelets introduced by Candès and Donoho
[38] and to the contourlets of Do and Vetterli [68]. However, their mathematical
construction is simpler, since it is derived from the structure of affine systems, and, as
a result, their development and applications are “more systematic” [129, 130].

As indicated by the example above, there is a special multiresolution analysis
associated with the affine systems with composite dilations that is useful for
constructing “composite wavelets.” Let us give a proper definition of this new framework.
Let $G_B$ be a countable subset of $\widetilde{SL}_n(\mathbb{Z}) = \{B \in GL_n(\mathbb{R}) : |\det B| = 1\}$ and
$G_A = \{A^i : i \in \mathbb{Z}\}$, where $A \in GL_n(\mathbb{Z})$ (notice that $A$ is an integral matrix). Also,
assume that $A$ normalizes $G_B$, that is, $A B A^{-1} \in G_B$ for every $B \in G_B$, and that the
quotient space $B/(A B A^{-1})$ is finite. Then the sequence $\{V_i\}_{i \in \mathbb{Z}}$ of closed subspaces
of $L^2(\mathbb{R}^n)$ is an AB-multiresolution analysis (AB-MRA) if the following hold:

1. $D_B T_k V_0 = V_0$, for any $B \in G_B$, $k \in \mathbb{Z}^n$.

2. For each $i \in \mathbb{Z}$, $V_i \subset V_{i+1}$, where $V_i = D_A^{-i} V_0$.

3. $\bigcap_{i \in \mathbb{Z}} V_i = \{0\}$ and $\overline{\bigcup_{i \in \mathbb{Z}} V_i} = L^2(\mathbb{R}^n)$.

4. There exists $\phi \in L^2(\mathbb{R}^n)$ such that $\Phi_B = \{D_B T_k \phi : B \in G_B,\ k \in \mathbb{Z}^n\}$ is a semi-orthogonal
Parseval frame for $V_0$, that is, $\Phi_B$ is a Parseval frame for $V_0$ and, in
addition, $D_B T_k \phi \perp D_{B'} T_{k'} \phi$ for any $B \ne B'$, $B, B' \in G_B$, $k, k' \in \mathbb{Z}^n$.

The space $V_0$ is called an AB scaling space and the function $\phi$ is an AB scaling
function for $V_0$. In addition, if $\Phi_B$ is an orthonormal basis for $V_0$, then $\phi$ is an orthonormal
AB scaling function.

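The number of generators $L$ of an orthonormal MRA AB-wavelet is completely
determined by the group $G = \{(B^j, k) : j \in \mathbb{Z},\ k \in \mathbb{Z}^n\}$. Indeed, we have the following
simple fact:

Proposition 3.11. Let $G$ be a countable group and $u \to T_u$ be a unitary representation
of $G$ acting on a (separable) Hilbert space $\mathcal{H}$. Suppose $\Phi = \{\phi^1, \dots, \phi^N\}$,
$\Psi = \{\psi^1, \dots, \psi^M\} \subset \mathcal{H}$, where $N, M \in \mathbb{N} \cup \{\infty\}$. If $\{T_u \phi^k : u \in G,\ 1 \le k \le N\}$ and
$\{T_u \psi^i : u \in G,\ 1 \le i \le M\}$ are each orthonormal bases for $\mathcal{H}$, then $N = M$.

Proof. It follows from the assumptions that, for each $1 \le k \le N$,

$$\|\phi^k\|^2 = \sum_{u \in G} \sum_{i=1}^{M} |\langle \phi^k, T_u \psi^i \rangle|^2.$$

Thus, by the properties of $T_u$, we have

$$N = \sum_{k=1}^{N} \|\phi^k\|^2 = \sum_{k=1}^{N} \sum_{u \in G} \sum_{i=1}^{M} |\langle \phi^k, T_u \psi^i \rangle|^2
= \sum_{i=1}^{M} \sum_{u \in G} \sum_{k=1}^{N} |\langle T_{u^{-1}} \phi^k, \psi^i \rangle|^2
= \sum_{i=1}^{M} \|\psi^i\|^2 = M. \qquad \Box$$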
Using Proposition 3.11, one obtains the following result, which establishes the
number of generators needed to obtain an orthonormal MRA AB-wavelet.

Theorem 3.12. Let $\Psi = \{\psi^1, \dots, \psi^L\}$ be an orthonormal MRA AB-multiwavelet
for $L^2(\mathbb{R}^n)$, and let $N = |B/A B A^{-1}|$ (= the order of the quotient group $B/A B A^{-1}$).
Assume that $|\det A| \in \mathbb{N}$. Then $L = N |\det A| - 1$.

The composite wavelet system $\mathcal{A}_{AB}(\Psi)$ has associated continuous multiwavelets.
The simplest case is the one in which the translations are $\{T_y : y \in \mathbb{R}^n\}$. In this case,
we have the reproducing formula corresponding to (3.20):

$$f = \sum_{\ell=1}^{L} \sum_{i,j \in \mathbb{Z}} \int_{\mathbb{R}^n} \langle f, D_A^i D_B^j T_y \psi^\ell \rangle\, D_A^i D_B^j T_y \psi^\ell\, dy, \tag{3.32}$$

for $f \in L^2(\mathbb{R}^n)$. As in Section 3.3, one can show that $\Psi = \{\psi^1, \dots, \psi^L\}$ satisfies
(3.32) if and only if it satisfies the Calderón equation

$$\sum_{\ell=1}^{L} \sum_{i,j \in \mathbb{Z}} |\hat\psi^\ell(\xi A^i B^j)|^2 = 1 \quad \text{for a.e. } \xi \in \mathbb{R}^n.$$

Some more general examples of continuous composite wavelet systems will be 
examined in Section 3.5. 


3.4.2 Other Examples 

There are several other examples of affine systems with composite dilations $\mathcal{A}_{AB}(\Psi)$
that form ON bases or Parseval frames.

In particular, the construction presented above in dimension $n = 2$ extends to the
general $n$-dimensional setting. In this case, the shear group is given by $G_B = \{B^i :
i \in \mathbb{Z}\}$, where $B \in GL_n(\mathbb{R})$ is characterized by the equality $(B - I_n)^2 = 0$, and $I_n$ is
the $n \times n$ identity matrix. We refer to [130] for more details about these systems.

A different type of affine system with composite dilations arises when $G_B$ is a
finite group. For example, let $G_B = \{\pm B_0, \pm B_1, \pm B_2, \pm B_3\}$ be the eight-element
group consisting of the isometries of the square $[-1,1]^2$. Specifically: $B_0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$,
$B_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $B_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$, $B_3 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$. Let $U$ be the parallelogram with
vertices (0,0), (1,0), (2,1), and (1,1) and $S_0 = \bigcup_{b \in B} U b$ (see the snowflake region
in Fig. 3.3). It is easy to verify that $S_0$ is $B$-invariant.

Let $A$ be the quincunx matrix, and $S_i = S_0 A^i$, $i \in \mathbb{Z}$. Observe that $A$
is expanding, $A B A^{-1} = B$, and $S_0 \subset S_0 A = S_1$. In particular, the region $S_1 \setminus S_0$ is
the disjoint union $\bigcup_{b \in B} R b$, where the region $R$ is the parallelogram illustrated in
Fig. 3.3. Thus, as in the case of the shear composite wavelet that we described
above, it follows that the system

$$\{D_A^i D_B T_k \psi : i \in \mathbb{Z},\ B \in G_B,\ k \in \mathbb{Z}^2\}, \tag{3.33}$$

where $\hat\psi = \chi_R$, is an orthonormal basis for $L^2(\mathbb{R}^2)$.

Fig. 3.3: Example of composite wavelet with finite group. $G_A = \{A^i : i \in \mathbb{Z}\}$, where $A$ is the
quincunx matrix, and $G_B$ is the group of isometries of the square $[-1,1]^2$.

If the quincunx matrix $A$ is replaced by the matrix $A = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$, we obtain a
different ON basis. Let $U$ and $S_i$, $i \in \mathbb{Z}$, be defined as above. Also, in this case,
$A$ is expanding, $A B A^{-1} = B$, and $S_1 = S_0 A \supset S_0$. A direct computation shows that
the region $S_1 \setminus S_0$ is the disjoint union $\bigcup_{b \in G_B} R b$, where $R = R_1 \cup R_2 \cup R_3$ and the
regions $R_1, R_2, R_3$ are illustrated in Fig. 3.4. Observe that each of the regions
$R_1, R_2, R_3$ is a fundamental domain. Thus, the system

$$\{D_A^i D_B T_k \psi^\ell : i \in \mathbb{Z},\ B \in G_B,\ k \in \mathbb{Z}^2,\ \ell = 1,\dots,3\}, \tag{3.34}$$

where $\hat\psi^\ell = \chi_{R_\ell}$, $\ell = 1,2,3$, is an orthonormal basis for $L^2(\mathbb{R}^2)$.

Note that the system in the first example [Eq. (3.33)] was generated by a single
function, while the second system [Eq. (3.34)] is generated by three functions
$\psi^1, \psi^2, \psi^3$. This is consistent with Theorem 3.12. In fact, if $B$ is a finite group,
then $N = |B/A B A^{-1}| = 1$, and so, in this situation, the number of generators is
$L = |\det A| - 1$. Thus, by Theorem 3.12, in the first example, we obtain that the
number of generators is $L = 1$ since $A$ is the quincunx matrix and $\det A = 2$. In
the second example, the number of generators is $L = 3$ since $A = 2I$ and $\det A = 4$.
Finally, in the example at the beginning of this section, where $G_B$ is the two-dimensional
group of shear matrices and $G_A = \{A^i : i \in \mathbb{Z}\}$, with $A \in GL_2(\mathbb{Z})$ as in that example, a
calculation shows that $|B/A B A^{-1}| = 2|a_{2,2}|^{-1}$, where $a_{2,2}$ denotes the lower-right entry of $A$,
and, thus, the number of generators
is $L = 2|a_{2,2}|^{-1} \cdot 2|a_{2,2}| - 1 = 3$.

Fig. 3.4: Example of composite wavelet with finite group. $G_A = \{A^i : i \in \mathbb{Z}\}$, where $A = 2I$, and
$G_B$ is the group of isometries of the square $[-1,1]^2$.
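The generator counts quoted for the three examples follow from the formula $L = N|\det A| - 1$ of Theorem 3.12. The short sketch below simply evaluates that formula; the function name is hypothetical, and for the shear example it assumes $A = \begin{pmatrix} 2 & 0 \\ 0 & \varepsilon \end{pmatrix}$ with integer $\varepsilon$ dividing 2, so that the index $N = 2/|\varepsilon|$ is an integer.

```python
from fractions import Fraction

def generator_count(index_N, det_A):
    """Evaluate L = N * |det A| - 1 (Theorem 3.12) and check it is a positive integer."""
    L = Fraction(index_N) * abs(Fraction(det_A)) - 1
    if L.denominator != 1 or L < 1:
        raise ValueError("N * |det A| - 1 is not a positive integer")
    return int(L)

# Finite group B (N = 1), quincunx dilation with det A = 2.
print(generator_count(1, 2))

# Finite group B (N = 1), A = 2I with det A = 4.
print(generator_count(1, 4))

# Shear group with A = diag(2, eps): N = 2/|eps| and det A = 2*eps (assumed form of A).
for eps in (1, -1, 2, -2):
    N = Fraction(2, abs(eps))
    print(eps, generator_count(N, 2 * eps))
```

In the shear case the factors $|\varepsilon|^{-1}$ and $|\varepsilon|$ cancel, so the count is 3 regardless of the admissible choice of $\varepsilon$.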

In higher dimensions, the type of constructions we have just described extends 
by using the Coxeter groups. These are finite groups (hence, their elements have 
determinant 1 in magnitude) generated by reflections through hyperplanes. 

Other examples of composite wavelets, in dimension n = 2, are obtained, for 
each A > 1 fixed, by considering the group 
and choosing $G_A$ to be a group of expanding matrices; for example,
$G_A = \{A^i : i \in \mathbb{Z}\}$, where $A$ is diagonal and $|\det A| > 1$. We refer to [130] for more
details about this construction.

All examples of composite wavelets presented so far are “direct” constructions 
in the frequency domain. Let us now discuss a different class of composite wavelets 
in the “time domain.” 

Perhaps the simplest dyadic-dilation wavelet in dimension $n = 1$ is the Haar
wavelet. It is produced by the scaling function $\phi = \chi_{[0,1)}$ and is generated by the
Haar function $\psi = \chi_{[0,1/2)} - \chi_{[1/2,1)}$. The Haar ON basis of $L^2(\mathbb{R})$ is the affine system

$$\{\psi_{i,k} = D_2^i T_k \psi : i, k \in \mathbb{Z}\}.$$
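Orthonormality of a few Haar functions can be checked by direct integration. The sketch below assumes the common indexing $\psi_{j,k}(x) = 2^{j/2}\psi(2^j x - k)$, which may differ from the dilation convention of Section 3.2 by the sign of the scale index; it integrates with a midpoint rule on a dyadic grid, which is exact (up to floating-point rounding) for the piecewise constant functions involved as long as their breakpoints lie on the grid.

```python
import math

def haar(x):
    """Haar function: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1:
        return -1.0
    return 0.0

def psi(j, k, x):
    """Dilated and translated Haar function 2^(j/2) * haar(2^j x - k) (assumed indexing)."""
    return 2 ** (j / 2) * haar(2 ** j * x - k)

def inner(j1, k1, j2, k2, lo=-4.0, hi=4.0, n=2 ** 12):
    """Inner product on [lo, hi] by the midpoint rule on a dyadic grid of n cells."""
    h = (hi - lo) / n
    total = 0.0
    for m in range(n):
        x = lo + (m + 0.5) * h
        total += psi(j1, k1, x) * psi(j2, k2, x)
    return total * h

print(inner(0, 0, 0, 0))    # norm squared of psi_{0,0}
print(inner(1, 1, 1, 1))    # norm squared of psi_{1,1}
print(inner(0, 0, 1, 0))    # different scales, overlapping supports
print(inner(0, 0, 0, 1))    # same scale, disjoint supports
print(inner(-1, 0, 0, 0))   # coarser scale against psi_{0,0}
```

The first two values are 1 and the remaining ones are 0, up to rounding.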


It is a natural question to ask what the extensions are of this compactly supported
wavelet $\psi$ in higher dimensions. For example, in dimension $n = 2$, consider the
quincunx matrix $A_q$ and the associated affine system

$$\{\psi_{i,k} = D_{A_q}^i T_k \psi : i \in \mathbb{Z},\ k \in \mathbb{Z}^2\}. \tag{3.35}$$

Then, similarly to the one-dimensional Haar wavelet, one can find an MRA wavelet
$\psi$ produced by a scaling function $\phi$ that is the characteristic function of a compact
set $Q \subset \mathbb{R}^2$ of area 1. However, the functions $\phi$ and $\psi$ are not that simple. In fact,
the scaling function $\phi$ is the characteristic function of a rather complicated fractal
set known as the “twin dragon” and $\psi$ is the difference of two similar characteristic
functions (see Fig. 3.5).



Fig. 3.5: (a) The fractal set known as “twin dragon.” (b) Support of the two-dimensional Haar 
wavelet y/; y/ = 1 on the darker set, y/ = — 1 on the lighter set. 


We can construct an affine system with composite dilations having the same
expanding dilation group $G_A = \{A_q^i : i \in \mathbb{Z}\}$ and the same translations that does,
however, generate a very simple Haar-type wavelet. For the group $G_B$, let us again
choose the group of symmetries of the unit square given at the beginning of this
section. Let $R_0$ be the triangle with vertices (0,0), (1/2,0), (1/2,1/2) and
$R_\ell = B_\ell R_0$, $\ell = 1,\dots,7$ (see Fig. 3.6). Then, for $\phi = 2\sqrt{2}\,\chi_{R_0}$, it follows that the
system

$$\{D_{B_\ell} T_k \phi : \ell = 0,\dots,7,\ k \in \mathbb{Z}^2\}$$

is an ON basis for the space $V_0$, which is the closed linear span of the subspace
of $L^2(\mathbb{R}^2)$ consisting of the functions that are constant on each $\mathbb{Z}^2$-translate of the
triangles $R_\ell$, $\ell = 0,1,\dots,7$. Let us now consider the spaces $V_i = D_A^{-i} V_0$, $i \in \mathbb{Z}$.
Then one can verify that each space $V_i$ is the closed linear span of the subspace of
$L^2(\mathbb{R}^2)$ consisting of the functions that are constant on each $A^{-i}\mathbb{Z}^2$-translate of the
triangles $A^{-i} R_\ell$, $\ell = 0,1,\dots,7$. Thus, $V_i \subset V_{i+1}$ for each $i \in \mathbb{Z}$, and the spaces $\{V_i\}$
form an AB-MRA, with $\phi$ as an AB-scaling function. We can now construct a simple
Haar-like wavelet obtained from this AB-MRA. Specifically, let


Rq =A~ l R\ U 




= A q l Ri U A q 1 





Fig. 3.6: Example of a composite wavelet with finite support. 


Thus, = X A -' Rx +Z a -i^ 6+ i( i y or, equivalently, 

^°\x) = ^ l \A q x) + ^ 6 \A q x -(?)), (3.36) 

where ^ = Db$, for £ = 0,1,... ,7. It is now easy to see that i/r = (A q x) — 
(j)( 6 \A q x— (j)) is the desired Haar-like AB- wavelet. The space Vo is generated 
by applying the translations 7^, k G Z 2 , to the scaling functions (j)^ = Dr$, 
£ = 0,1,..., 7. We see that this is the case by applying Dr £ in Eq. (3.36); we obtain 

</>(°) = ^V 9 4 + 0 (6) (V-(?)), 
^=^ 2 \A q x) + ^ 5 \A q x -(?)),
<t>W = <l>W(A q x) + <l>( 0 \A q x-(<>)), 
^ =^ 4 \A q x) + (j) ( - 7 \A q x-( 0 l )), 
</> (4) = (t> {5) (A q x) + </> ( 2 ) (A ? x- (?)), 
0( 5 )=^( 6 )(A 9 x) + ^W(A 9 x-(?)), 
0( 6 )=^( 7 )(A 9 x) + ^( 4 )(A 9 x-(?)), 

</>( 7 )=^V 9 *) + 4> (3 W-(i))- 


It follows that

$$\{D_A^i D_{B_\ell} T_k \psi : i \in \mathbb{Z},\ \ell = 0,1,\dots,7,\ k \in \mathbb{Z}^2\}$$

is an ON basis for $L^2(\mathbb{R}^2)$. This Haar-type AB-wavelet is clearly simpler than the
twin dragon wavelet obtained earlier. We refer to [23, 154] for more information
about this type of construction.

Other complicated fractal wavelets appear in many situations. For example, if
the dilation matrix $A_q$ in the affine system (3.35) is replaced by one of the dilation
matrices $A_{q1}$ or $A_{q2}$ of Fig. 3.7, then also in this case there is a compactly supported
MRA wavelet generated by a (compactly supported) scaling function $\phi$ that is the
characteristic function of a fractal set (see Fig. 3.7).

The construction given above suggests that also in these cases one should be able
to find an AB-MRA such that the associated compactly supported AB-wavelet has a
simpler “nonfractal” support. This is done in [154].






Fig. 3.7: The fractal sets associated with the MRA generated by the dilation matrices (a) $A_{q1}$
and (b) $A_{q2}$.
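3.5 Continuous Shearlet Transform

An important class of subgroups of the affine group $A_2$ (which was described in
Section 3.3) is obtained by considering

$$G = \{(M,t) : M \in \mathcal{D}_\alpha,\ t \in \mathbb{R}^2\}, \tag{3.37}$$

where, for each $0 < \alpha < 1$, $\mathcal{D}_\alpha \subset GL_2(\mathbb{R})$ is the set of matrices

$$\mathcal{D}_\alpha = \left\{ M = M_{as} = \begin{pmatrix} a & -a^{\alpha} s \\ 0 & a^{\alpha} \end{pmatrix} : a > 0,\ s \in \mathbb{R} \right\}.$$

The matrices $M_{as}$ can be factorized as $M_{as} = B_s A_a$, where

$$B_s = \begin{pmatrix} 1 & -s \\ 0 & 1 \end{pmatrix}, \qquad A_a = \begin{pmatrix} a & 0 \\ 0 & a^{\alpha} \end{pmatrix}. \tag{3.38}$$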

The matrix $B_s$ is called a shear matrix and, for each $s \in \mathbb{R}$, is a nonexpanding matrix
($\det B_s = 1$ for each $s$). The matrix $A_a$ is an anisotropic dilation matrix, that is, the
dilation rate is different in the $x$ and $y$ directions. In particular, if $\alpha = 1/2$, the
matrix $A_a$ produces parabolic scaling since $f(A_a x) = f\big(A_a \binom{x_1}{x_2}\big)$ leaves invariant
the parabola $x_1 = x_2^2$. Thus, the action associated with the dilation group $\mathcal{D}_\alpha$ can be
interpreted as the superposition of anisotropic dilation and shear transformations.
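The factorization $M_{as} = B_s A_a$ and the parabolic-scaling remark can be checked with a few lines of arithmetic. The sketch below assumes the matrices $M_{as} = \begin{pmatrix} a & -a^{\alpha}s \\ 0 & a^{\alpha} \end{pmatrix}$, $B_s = \begin{pmatrix} 1 & -s \\ 0 & 1 \end{pmatrix}$, $A_a = \begin{pmatrix} a & 0 \\ 0 & a^{\alpha} \end{pmatrix}$; the function names and sample values are illustrative.

```python
import math

def mat_mul(P, Q):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(tuple(sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def shear(s):
    return ((1.0, -s), (0.0, 1.0))

def dilation(a, alpha):
    return ((a, 0.0), (0.0, a ** alpha))

def M(a, s, alpha):
    return ((a, -(a ** alpha) * s), (0.0, a ** alpha))

def row_times(xi, P):
    """Row vector times matrix, the convention used in the frequency domain."""
    return (xi[0] * P[0][0] + xi[1] * P[1][0], xi[0] * P[0][1] + xi[1] * P[1][1])

a, s, alpha = 0.25, 1.5, 0.5

# Factorization M_as = B_s A_a.
print(mat_mul(shear(s), dilation(a, alpha)) == M(a, s, alpha))

# Frequency-side action: xi M_as = (a xi_1, a^alpha (xi_2 - s xi_1)).
xi = (3.0, -2.0)
print(row_times(xi, M(a, s, alpha)), (a * xi[0], a ** alpha * (xi[1] - s * xi[0])))

# Parabolic scaling (alpha = 1/2): a point of the parabola x1 = x2^2 stays on it.
x2 = 1.7
y1, y2 = a * x2 ** 2, a ** alpha * x2
print(math.isclose(y1, y2 ** 2))
```

The second check reproduces the expression for $\xi M_{as}$ that drives the support computations for shearlets in the frequency plane.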

Using Theorem 3.8 from Section 3.3, we can establish simple conditions on
the function $\psi$ so that it will satisfy the Calderón reproducing formula (3.20) with
respect to $G$. This is done in the following proposition.

Proposition 3.13. Let $G$ be given by (3.37) and, for $\xi = (\xi_1, \xi_2) \in \mathbb{R}^2$, $\xi_1 \ne 0$, let $\psi$
be given by

$$\hat\psi(\xi) = \hat\psi(\xi_1, \xi_2) = \hat\psi_1(\xi_1)\, \hat\psi_2\big(\tfrac{\xi_2}{\xi_1}\big).$$

Suppose that

1. $\psi_1 \in L^2(\mathbb{R})$ satisfies

$$\int_0^{\infty} |\hat\psi_1(a\xi)|^2\, \frac{da}{a^{2\alpha}} = 1 \quad \text{for a.e. } \xi \in \mathbb{R};$$

2. $\|\psi_2\|_{L^2} = 1$.

Then $\psi$ satisfies (3.20) and, hence, is a continuous wavelet with respect to $G$.

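Proof. A direct computation shows that $(\xi_1, \xi_2) M_{as} = (a\xi_1,\, a^{\alpha}(\xi_2 - s\xi_1))$. Also,
notice that the element of the left Haar measure for $\mathcal{D}_\alpha$ is $d\mu(M_{as}) = \frac{da}{|\det M_{as}|}\, ds$.
Hence, the admissibility condition (3.21) for $\psi$ is

$$\Delta(\psi)(\xi) = \int_{\mathbb{R}} \int_{\mathbb{R}^+} |\hat\psi_1(a\xi_1)|^2\, \big|\hat\psi_2\big(a^{\alpha-1}(\tfrac{\xi_2}{\xi_1} - s)\big)\big|^2\, \frac{da}{a^{1+\alpha}}\, ds = 1 \tag{3.39}$$

for a.e. $(\xi_1, \xi_2) \in \mathbb{R}^2$. Thus, by Theorem 3.8, to show that $\psi$ is a continuous wavelet
with respect to $G$, it is sufficient to show that (3.39) is satisfied. Using the assumption
on $\psi_1$ and $\psi_2$, we have

$$\Delta(\psi)(\xi) = \int_{\mathbb{R}} \int_{\mathbb{R}^+} |\hat\psi_1(a\xi_1)|^2\, \big|\hat\psi_2\big(a^{\alpha-1}(\tfrac{\xi_2}{\xi_1} - s)\big)\big|^2\, \frac{da}{a^{1+\alpha}}\, ds$$

$$= \int_{\mathbb{R}^+} |\hat\psi_1(a\xi_1)|^2 \left( \int_{\mathbb{R}} \big|\hat\psi_2\big(a^{\alpha-1}\tfrac{\xi_2}{\xi_1} - s\big)\big|^2\, ds \right) \frac{da}{a^{2\alpha}}$$

$$= \int_{\mathbb{R}^+} |\hat\psi_1(a\xi_1)|^2\, \frac{da}{a^{2\alpha}} = 1$$

for a.e. $\xi = (\xi_1, \xi_2) \in \mathbb{R}^2$. This shows that Eq. (3.39) is satisfied. $\Box$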

In the following, to distinguish a continuous wavelet $\psi$ associated with this
particular group $G$ from other continuous wavelets, we will refer to such a function
as a continuous shearlet. Hence, for each $0 < \alpha < 1$, the continuous shearlet
transform is the mapping

$$f \to \{\mathcal{SH}^{\alpha}_{\psi} f(a,s,t) = \langle f, \psi_{ast} \rangle : a > 0,\ s \in \mathbb{R},\ t \in \mathbb{R}^2\},$$

where the analyzing elements,

$$\{\psi_{ast}(x) = |\det M_{as}|^{-1/2}\, \psi(M_{as}^{-1}(x - t)) : a > 0,\ s \in \mathbb{R},\ t \in \mathbb{R}^2\},$$

with $M_{as} \in \mathcal{D}_\alpha$, form a continuous shearlet system. Notice that, according to the
terminology introduced in Section 3.3, the elements $\{\psi_{ast}\}$ are co-affine functions.

A useful variant of the continuous shearlet transform is obtained by restricting
the range of the shear variable $s$ associated with the shearing matrices $B_s$ to a finite
interval. Namely, for $0 < \alpha < 1$, let us redefine

$$\mathcal{D}^{(h)}_{\alpha} = \left\{ M_{as} : 0 < a \le \tfrac{1}{4},\ -\tfrac{3}{2} \le s \le \tfrac{3}{2} \right\}$$

and

$$G^{(h)} = \{(M,t) : M \in \mathcal{D}^{(h)}_{\alpha},\ t \in \mathbb{R}^2\}.$$

Also, consider the subspace of $L^2(\mathbb{R}^2)$ given by $L^2(C_h)^{\vee} = \{f \in L^2(\mathbb{R}^2) :
\operatorname{supp} \hat f \subset C_h\}$, where $C_h$ is the “horizontal cone” in the frequency plane:

$$C_h = \left\{ (\xi_1, \xi_2) \in \mathbb{R}^2 : |\xi_1| \ge 1 \text{ and } \big|\tfrac{\xi_2}{\xi_1}\big| \le 1 \right\}.$$

Hence, we can show that by slightly modifying the assumptions of Proposition 3.13,
the function $\psi$ is a continuous shearlet for the subspace $L^2(C_h)^{\vee}$.

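Proposition 3.14. For $\xi = (\xi_1, \xi_2) \in \mathbb{R}^2$, $\xi_1 \ne 0$, let $\psi$ be given by

$$\hat\psi(\xi) = \hat\psi(\xi_1, \xi_2) = \hat\psi_1(\xi_1)\, \hat\psi_2\big(\tfrac{\xi_2}{\xi_1}\big),$$

where

1. $\psi_1 \in L^2(\mathbb{R})$ satisfies

$$\int_0^{\infty} |\hat\psi_1(a\xi)|^2\, \frac{da}{a^{2\alpha}} = 1 \quad \text{for a.e. } \xi \in \mathbb{R},$$

and $\operatorname{supp} \hat\psi_1 \subset [-2, -1/2] \cup [1/2, 2]$;

2. $\|\psi_2\|_{L^2} = 1$ and $\operatorname{supp} \hat\psi_2 \subset [-1, 1]$.

Then $\psi$ satisfies (3.24). That is, for all $f \in L^2(C_h)^{\vee}$,

$$f(x) = \int_{\mathbb{R}^2} \int_{-3/2}^{3/2} \int_0^{1/4} \langle f, \psi_{ast} \rangle\, \psi_{ast}(x)\, \frac{da}{a^{2(1+\alpha)}}\, ds\, dt,$$

with convergence in the $L^2$ sense.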

There are several examples of functions $\psi_1$ and $\psi_2$ satisfying the assumptions
of Propositions 3.13 and 3.14. In particular, we can choose $\psi_1$, $\psi_2$ such that $\hat\psi_1$,
$\hat\psi_2 \in C_0^{\infty}$, and we will make this assumption in the following. We refer to [122, 130]
for the construction of these functions.

If the assumptions of Proposition 3.14 are satisfied, we say that the set

$$\Psi^{(h)} = \{\psi_{ast} : 0 < a \le \tfrac{1}{4},\ -\tfrac{3}{2} \le s \le \tfrac{3}{2},\ t \in \mathbb{R}^2\}$$

is a continuous shearlet system for $L^2(C_h)^{\vee}$ and that the corresponding mapping
from $f \in L^2(C_h)^{\vee}$ into $\mathcal{SH}^{(h),\alpha}_{\psi} f(a,s,t) = \langle f, \psi_{ast} \rangle$ is the continuous shearlet
transform on $L^2(C_h)^{\vee}$.

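In the frequency domain, an element of the shearlet system $\psi_{ast}$ has the form

$$\hat\psi_{ast}(\xi_1, \xi_2) = a^{\frac{1+\alpha}{2}}\, \hat\psi_1(a\xi_1)\, \hat\psi_2\big(a^{\alpha-1}(\tfrac{\xi_2}{\xi_1} - s)\big)\, e^{-2\pi i \xi t}.$$

As a result, each function $\hat\psi_{ast}$ has support:

$$\operatorname{supp} \hat\psi_{ast} \subset \left\{ (\xi_1, \xi_2) : \xi_1 \in \big[-\tfrac{2}{a}, -\tfrac{1}{2a}\big] \cup \big[\tfrac{1}{2a}, \tfrac{2}{a}\big],\ \big|\tfrac{\xi_2}{\xi_1} - s\big| \le a^{1-\alpha} \right\}.$$

As illustrated in Fig. 3.8, the frequency support is a pair of trapezoids, symmetric
with respect to the origin, oriented along a line of slope $s$. The support becomes
increasingly elongated as $a \to 0$.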
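The shape of the frequency support can be explored with a small membership test. The sketch below encodes the support condition $|\xi_1| \in [\tfrac{1}{2a}, \tfrac{2}{a}]$, $|\xi_2/\xi_1 - s| \le a^{1-\alpha}$ stated above; the function names are hypothetical, and `support_dimensions` reports the extent along $\xi_1$ of one trapezoid together with the width of the admissible slope interval, two quantities chosen here to illustrate the elongation.

```python
def in_frequency_support(xi, a, s, alpha=0.5):
    """True if xi lies in the stated support set of the shearlet with parameters (a, s)."""
    xi1, xi2 = xi
    if xi1 == 0:
        return False
    radial = 1 / (2 * a) <= abs(xi1) <= 2 / a
    angular = abs(xi2 / xi1 - s) <= a ** (1 - alpha)
    return radial and angular

def support_dimensions(a, alpha=0.5):
    """Extent along xi_1 of one trapezoid and width of the slope interval."""
    length = 2 / a - 1 / (2 * a)
    slope_width = 2 * a ** (1 - alpha)
    return length, slope_width

a, s = 1 / 16, 0.5
print(in_frequency_support((16.0, 8.0), a, s))     # on the line of slope s, inside the band
print(in_frequency_support((-16.0, -8.0), a, s))   # symmetric point
print(in_frequency_support((16.0, 16.0), a, s))    # slope too far from s
print(in_frequency_support((4.0, 2.0), a, s))      # |xi_1| below 1/(2a)

for scale in (1 / 4, 1 / 16, 1 / 64):
    print(scale, support_dimensions(scale))
```

As the scale decreases, the extent along $\xi_1$ grows like $1/a$ while the slope interval shrinks like $a^{1-\alpha}$, which is the elongation described in the text.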

As shown by Proposition 3.14, the continuous shearlet transform provides
a reproducing formula only for functions in a proper subspace of $L^2(\mathbb{R}^2)$. To extend
the transform to all $f \in L^2(\mathbb{R}^2)$, we introduce a similar transform to deal with the
functions supported on the “vertical cone” $C_v$.

Fig. 3.8: Frequency support of the (a) horizontal shearlets and (b) vertical shearlets for different
values of $a$ and $s$.

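Specifically, let

$$\hat\psi^{(v)}(\xi) = \hat\psi^{(v)}(\xi_1, \xi_2) = \hat\psi_1(\xi_2)\, \hat\psi_2\big(\tfrac{\xi_1}{\xi_2}\big),$$

where $\hat\psi_1$, $\hat\psi_2$ satisfy the same assumptions as in Proposition 3.14, and consider the
dilation group

$$G^{(v)} = \left\{ (N_{as}, t) : N_{as} = \begin{pmatrix} a^{\alpha} & 0 \\ -a^{\alpha} s & a \end{pmatrix},\ 0 < a \le \tfrac{1}{4},\ -\tfrac{3}{2} \le s \le \tfrac{3}{2},\ t \in \mathbb{R}^2 \right\}.$$

Then it is easy to verify that the set

$$\Psi^{(v)} = \{\psi^{(v)}_{ast} : 0 < a \le \tfrac{1}{4},\ -\tfrac{3}{2} \le s \le \tfrac{3}{2},\ t \in \mathbb{R}^2\},$$

where $\psi^{(v)}_{ast} = |\det N_{as}|^{-1/2}\, \psi^{(v)}(N_{as}^{-1}(x - t))$, is a continuous shearlet system for
$L^2(C_v)^{\vee}$. The corresponding transform $\mathcal{SH}^{(v),\alpha}_{\psi} f(a,s,t) = \langle f, \psi^{(v)}_{ast} \rangle$ is the continuous
shearlet transform on $L^2(C_v)^{\vee}$. Finally, by introducing an appropriate window
function $W$, we can represent the functions with frequency support on the set
$[-2,2]^2$ as

$$f = \int_{\mathbb{R}^2} \langle f, W_t \rangle\, W_t\, dt,$$

where $W_t(x) = W(x - t)$. As a result, any function $f \in L^2(\mathbb{R}^2)$ can be reproduced
with respect to the full shearlet system, which consists of the horizontal shearlet
system $\Psi^{(h)}$, the vertical shearlet system $\Psi^{(v)}$, and the collection of coarse-scale
isotropic functions $\{W_t : t \in \mathbb{R}^2\}$. We refer to [156] for more details about this
representation. For our purposes, it is only the behavior of the fine-scale shearlets
that matters. Indeed, in the following, we will apply the continuous shearlet
transforms $\mathcal{SH}^{(h),\alpha}_{\psi}$ and $\mathcal{SH}^{(v),\alpha}_{\psi}$ at fine scales ($a \to 0$), to resolve and precisely
describe the boundaries of certain planar regions. Hence, it will be convenient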
describe the boundaries of certain planar regions. Hence, it will be convenient 
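to redefine the shearlet transform, at “fine scales,” as follows. For $0 < a \le 1/4$,
$s \in \mathbb{R}$, $t \in \mathbb{R}^2$, the (fine-scale) continuous shearlet transform is the mapping from
$f \in L^2(\mathbb{R}^2 \setminus [-2,2]^2)^{\vee}$ into $\mathcal{SH}^{\alpha}_{\psi} f$, which is defined by

$$\mathcal{SH}^{\alpha}_{\psi} f(a,s,t) = \begin{cases} \mathcal{SH}^{(h),\alpha}_{\psi} f(a,s,t), & \text{if } |s| \le 1, \\ \mathcal{SH}^{(v),\alpha}_{\psi} f(a,\tfrac{1}{s},t), & \text{if } |s| > 1. \end{cases}$$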

3.5.1 Edge Analysis Using the Shearlet Transform 

One remarkable property of the continuous shearlet transform is its ability to provide
a very precise characterization of the set of singularities of functions and distributions.
Indeed, let $f$ be a function on $\mathbb{R}^2$ consisting of several smooth regions $\Omega_n$,
$n = 1,\dots,N$, separated by piecewise smooth boundaries $\gamma_n = \partial\Omega_n$:

$$f(x) = \sum_{n=1}^{N} f_n(x)\, \chi_{\Omega_n}(x),$$

where each function $f_n$ is smooth. Then the continuous shearlet transform
$\mathcal{SH}^{\alpha}_{\psi} f(a,s,t)$ will signal both the location and orientation of the boundaries through
its asymptotic decay at fine scales. In fact, $\mathcal{SH}^{\alpha}_{\psi} f(a,s,t)$ will exhibit fast asymptotic
decay as $a \to 0$ for all $(s,t)$, except for the values of $t$ on the boundary curves $\gamma_n$ and
for the values of $s$ associated with the normal orientation to the $\gamma_n$ at $t$.

The study of these objects is motivated by image applications, where $f$ is used 
to model an image, and the curves $\gamma_n$ are the edges of the image $f$. We will show 
that the shearlet framework provides a very effective method for the detection and 
analysis of edges. This is a fundamental problem in many applications from computer 
vision to image processing. 

To illustrate how the shearlet transform can be employed to characterize the 
geometry of edges, let us consider the case where $f$ is simply the characteristic 
function of a bounded subset of $\mathbb{R}^2$. Also, to simplify the presentation, we will only 
present the situation where $\alpha = 1/2$ and use the simplified notation $\mathcal{S}_\psi = \mathcal{S}^{1/2}_\psi$. In 
the more general case where $\alpha \in (0,1)$, the continuous shearlet transform $\mathcal{S}^{\alpha}_\psi$ is similar, 
and details can be found in [127]. 

We then have the following result from [127]. 

Theorem 3.15. Let $D \subset \mathbb{R}^2$ be a bounded region in $\mathbb{R}^2$, and suppose that the 
boundary curve $\gamma = \partial D$ is a simple $C^3$ regular curve. Denote $B = \chi_D$. If $t = t_0 \in \gamma$, 
and $s_0 = \tan\theta_0$, where $\theta_0$ is the angle corresponding to the normal orientation to $\gamma$ 
at $t_0$, then 

$$\lim_{a \to 0^+} a^{-3/4}\,\mathcal{S}_\psi B(a,s_0,t_0) \neq 0.$$

If $t = t_0 \in \gamma$ and $s \neq \tan\theta_0$, or if $t \notin \gamma$, then 

$$\lim_{a \to 0^+} a^{-\beta}\,\mathcal{S}_\psi B(a,s,t) = 0 \quad \text{for all } \beta > 0.$$

This shows that the continuous shearlet transform $\mathcal{S}_\psi B(a,s,t)$ has “slow” decay 
only for $t = t_0$ on $\gamma$ when the value of the shear variable $s$ corresponds exactly to the 
normal orientation to $\gamma$ at $t_0$. For all other values of $t$ and $s$, the decay is fast. This 
behavior is illustrated in Fig. 3.9. 



Fig. 3.9: Asymptotic decay of the continuous shearlet transform of the function $B(x) = \chi_D(x)$. On the 
boundary $\partial D$, for normal orientation, the shearlet transform decays as $O(a^{3/4})$. For all other values 
of $(t,s)$, the decay is as fast as $O(a^N)$, for any $N \in \mathbb{N}$. 
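
As a rough numerical illustration of the dichotomy in Theorem 3.15, the decay exponent at a fixed pair $(s,t)$ can be estimated as the least-squares slope of $\log|\mathcal{S}_\psi B(a,s,t)|$ against $\log a$. The following is only a sketch under stated assumptions: it uses synthetic magnitudes of the form $c\,a^{3/4}$ and $c\,a^{6}$ rather than an actual shearlet transform, and the function name `decay_exponent`, the constant 0.8, and the chosen scales are hypothetical.

```python
import math


def decay_exponent(scales, magnitudes):
    """Least-squares slope of log(magnitude) versus log(scale)."""
    xs = [math.log(a) for a in scales]
    ys = [math.log(v) for v in magnitudes]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical fine scales a = 2^(-2j).
scales = [2.0 ** (-2 * j) for j in range(2, 9)]

# Synthetic stand-ins (assumptions, not transform output):
edge = [0.8 * a ** 0.75 for a in scales]   # normal orientation at a boundary point
smooth = [0.8 * a ** 6 for a in scales]    # placeholder for rapid decay elsewhere

print(decay_exponent(scales, edge))
print(decay_exponent(scales, smooth))
```

A fitted slope near 3/4 is what the theorem predicts for the normal orientation at a boundary point; large slopes correspond to the rapid decay found for all other pairs $(s,t)$.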

Theorem 3.15 can be generalized to the situation where the boundary curve $\gamma$ 
is piecewise smooth and contains finitely many corner points. Also, in this case, 
the continuous shearlet transform provides a precise description of the geometry of 
the boundary curve through its asymptotic decay at fine scales. In particular, at the 
corner points, the asymptotic decay at fine scales is the slowest for values of $s$ 
corresponding to the normal directions (notice that there are two of them). We refer the 
interested reader to [128] for a detailed discussion of the shearlet analysis of regions 
with piecewise smooth boundaries. We also refer to [127, 156] for other related 
results, including the situation where $f$ is not simply the union of characteristic 
functions of sets. 

Finally, we recall that the shearlet transform shares some of the features described 
above with the continuous curvelet transform, another directional multiscale transform 
introduced by Candès and Donoho in [41]. Even if a result like Theorem 3.15 is 
not known for the curvelet transform, other results in [41] indicate that the curvelet 
transform is also able to capture the geometry of singularities in $\mathbb{R}^2$ through its 
asymptotic decay at fine scales. Notice that, unlike the shearlet transform, the 
curvelet transform is not directly associated with an affine group. 


3.5.2 A Shearlet Approach to Edge Analysis and Detection 

Taking advantage of the properties of the continuous shearlet transform described 
above, an efficient numerical algorithm for edge detection was designed by one 
of the authors and his collaborators [82, 83]. The shearlet approach adapts several 
ideas from the well-known wavelet modulus maxima method of Hwang, Mallat, 
and Zhong [176, 177], where the edge points of an image $f$ are identified as the 
locations corresponding to the local maxima of the magnitude of the continuous 
wavelet transform of $f$. Recall that, at a single scale, this wavelet-based method is 
indeed equivalent to the Canny edge detector, which is a standard edge detection 
algorithm [62]. 

As shown above, one main feature of the continuous shearlet transform is its 
superior directional selectivity with respect to wavelets and other traditional methods. 
This property plays a very important role in the design of the edge detection 
algorithm. In fact, one major task in edge detection is to accurately identify the 
edges of an image in the presence of noise. To perform this task, both the location 
and the orientation of edge points have to be estimated from a noisy image. 

In the usual wavelet modulus maxima approach, the edge orientation of an image 
$f$, at the location $t$, is estimated by looking at the ratio of the vertical over the 
horizontal components of $W_\psi f(a,t)$, the wavelet transform of $f$. However, this approach 
is not very accurate when dealing with discrete data. The advantage of the continuous 
shearlet transform is that, by representing the image as a function of scale, 
location, and orientation, the directional information is directly available. A number 
of tests conducted in [82, 83] show indeed that a shearlet-based approach provides 
a very accurate estimate of the edge orientation of a noisy image; this method 
significantly outperforms the wavelet-based approach. A typical numerical experiment 
is illustrated in Fig. 3.10, where the test image is the characteristic function 
of a disc. This figure displays the average angular error in the estimate of the 
edge’s orientation, as a function of the scale $a$. The average angle error is defined by 

$$\frac{1}{|E|}\sum_{t \in E} |\hat\theta(t) - \theta(t)|,$$

where $E$ is the set of edge points, $\theta$ is the exact angle, and $\hat\theta$ the estimated angle. 
The average angle error is indicated for both shearlet- and wavelet-based methods, 
in the presence of additive Gaussian noise. As the figure shows, the shearlet 
approach significantly outperforms the wavelet method, especially at finer scales, 
and is extremely robust to noise. 
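
The average angle error described above is simple to compute once exact and estimated orientations are available at the edge points. The following is a minimal sketch, not the code used in [82, 83]: the function name `average_angle_error`, the dictionary layout, and the sample data are hypothetical, and wrapping the differences modulo $\pi$ is an assumption of this sketch (orientations are only defined up to a half turn).

```python
import math


def average_angle_error(exact, estimated, period=math.pi):
    """Mean absolute difference between estimated and exact angles over E.

    `exact` and `estimated` map each edge point t to an angle in radians.
    Differences are wrapped into [0, period / 2]; this wrapping is an
    assumption of the sketch, not part of the definition in the text.
    """
    total = 0.0
    for t, theta in exact.items():
        d = abs(estimated[t] - theta) % period
        total += min(d, period - d)
    return total / len(exact)


# Hypothetical edge points on a circle, with a constant 0.02 rad estimation error.
exact = {(math.cos(k), math.sin(k)): k for k in (0.1, 0.9, 1.7, 2.5)}
estimated = {t: theta + 0.02 for t, theta in exact.items()}

print(average_angle_error(exact, estimated))
```

Evaluating this quantity at each scale $a$, once for each method, gives curves of the kind compared in Fig. 3.10.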



Fig. 3.10: (a) Test image, (b-c) Comparison of the average error in angle estimation using the 
wavelet method versus the shearlet method, as a function of the scale, with different noise levels; 
(b) PSNR = 16.9 dB, (c) PSNR = 4.9 dB. (Courtesy of Sheng Yi.) 
Using these properties, a very competitive algorithm for edge detection was 
developed in [83] and a representative numerical test is illustrated in Fig. 3.11. We 
refer to [82, 83] for details about these algorithms and for additional numerical 
demonstrations. 



Fig. 3.11: Comparison of edge detection using a shearlet-based method versus a wavelet-based 
method. From top left, clockwise: original image, noisy image (PSNR = 24.59 dB), shearlet result, 
and wavelet result. (Courtesy of Glenn Easley.) 


3.5.3 Discrete Shearlet System 


By sampling the continuous shearlet transform 

$$f \mapsto \mathcal{S}_\psi f(a,s,t) = \langle f, \psi_{ast}\rangle$$

on an appropriate discrete set of the scaling, shear, and translation parameters 
$(a,s,t) \in \mathbb{R}^+ \times \mathbb{R} \times \mathbb{R}^2$, it is possible to obtain a frame or even a Parseval frame 
for $L^2(\mathbb{R}^2)$. Notice that, as above, we will only consider the case $\alpha = 1/2$. 

To construct the discrete shearlet system (see [157] for more details), we start by 
choosing a discrete set of scales $\{a_j\}_{j \in \mathbb{Z}} \subset \mathbb{R}^+$; next, for each fixed $j$, we choose 
the shear parameters $\{s_{j,\ell}\}_{\ell \in \mathbb{Z}} \subset \mathbb{R}$ so that the directionality of the representation is 
allowed to change with the scale. Finally, to provide a “uniform covering” of $\mathbb{R}^2$, we 
allow the location parameter to describe a different grid depending on $j$ and $\ell$; hence, 
we let $t_{j,\ell,k} = B_{s_{j,\ell}} A_{a_j} k$, $k \in \mathbb{Z}^2$, where the matrices $B_s$, for $s \in \mathbb{R}$, and $A_a$, for $a > 0$, 
are given by (3.38). Observing that 

$$T_{B_{s_{j,\ell}} A_{a_j} k}\, D_{B_{s_{j,\ell}} A_{a_j}} = D_{B_{s_{j,\ell}} A_{a_j}}\, T_k,$$

we obtain the discrete system 

$$\{\psi_{j,\ell,k} = D_{B_{s_{j,\ell}} A_{a_j}} T_k \psi : j,\ell \in \mathbb{Z},\ k \in \mathbb{Z}^2\}.$$

In particular, we will set $a_j = 2^{2j}$, $s_{j,\ell} = \ell\sqrt{a_j} = \ell\,2^{j}$. Thus, observing that $B_{\ell 2^j} A_{2^{2j}} = 
A_{2^{2j}} B_\ell$, we finally obtain the discrete shearlet system 

$$\{\psi_{j,\ell,k} = D_{A_4}^{j} D_{B_1}^{\ell} T_k \psi : j,\ell \in \mathbb{Z},\ k \in \mathbb{Z}^2\}. \qquad (3.40)$$

Notice that (3.40) is an example of the affine systems with composite dilations 
(3.31), described in Section 3.4. More specifically, the discrete shearlet system 
obtained above is similar to the “shearlet-like” system (3.30). Unlike the system 
(3.30), however, whose elements are characteristic functions of sets in the frequency 
domain, we will show that in this case we obtain a system of well-localized 
functions. 
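
The commutation relation between shear and anisotropic dilation that leads to (3.40) can be checked numerically. The sketch below assumes the forms $A_a = \mathrm{diag}(a, \sqrt{a})$ and $B_s = \begin{pmatrix} 1 & -s \\ 0 & 1 \end{pmatrix}$; these are assumptions made for illustration, since the matrices of (3.38) are not restated in this section, and the helper names `A`, `B`, and `shear_scale_commute` are hypothetical. The identity holds equally if the sign of the shear entry is flipped.

```python
import math


def matmul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def A(a):
    # Assumed form of the anisotropic (parabolic) dilation matrix A_a.
    return [[a, 0.0], [0.0, math.sqrt(a)]]


def B(s):
    # Assumed form of the shear matrix B_s.
    return [[1.0, -s], [0.0, 1.0]]


def shear_scale_commute(j, l, tol=1e-12):
    """Check B_{l 2^j} A_{2^{2j}} == A_{2^{2j}} B_l entrywise."""
    a = 2.0 ** (2 * j)
    lhs = matmul(B(l * 2.0 ** j), A(a))
    rhs = matmul(A(a), B(float(l)))
    return all(abs(lhs[r][c] - rhs[r][c]) <= tol * a
               for r in range(2) for c in range(2))


print(all(shear_scale_commute(j, l)
          for j in range(5) for l in range(-2 ** j, 2 ** j)))
```

The check makes the role of the choice $s_{j,\ell} = \ell\,2^{j}$ visible: the shear parameter has to grow like $\sqrt{a_j}$ for the shear to pass through the dilation as an integer power of a fixed shear matrix.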

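To do that, we will adapt some ideas from the continuous case. Namely, for any 
$\xi = (\xi_1,\xi_2) \in \hat{\mathbb{R}}^2$, $\xi_1 \neq 0$, let 

$$\hat\psi^{(h)}(\xi) = \hat\psi^{(h)}(\xi_1,\xi_2) = \hat\psi_1(\xi_1)\,\hat\psi_2(\xi_2/\xi_1),$$

where $\hat\psi_1, \hat\psi_2 \in C^\infty(\hat{\mathbb{R}})$, $\operatorname{supp}\hat\psi_1 \subset [-1/2,-1/16] \cup [1/16,1/2]$, and $\operatorname{supp}\hat\psi_2 \subset 
[-1,1]$. This implies that $\hat\psi^{(h)}$ is a compactly supported $C^\infty$ function with support 
contained in $[-1/2,1/2]^2$. In addition, we assume that 

$$\sum_{j \ge 0} |\hat\psi_1(2^{-2j}\omega)|^2 = 1 \quad \text{for } |\omega| \ge \tfrac{1}{8} \qquad (3.41)$$

and, for each $j \ge 0$, 

$$\sum_{\ell=-2^j}^{2^j-1} |\hat\psi_2(2^{j}\omega - \ell)|^2 = 1 \quad \text{for } |\omega| \le 1. \qquad (3.42)$$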

From the conditions on the support of $\hat\psi_1$ and $\hat\psi_2$, one can easily deduce that the 
functions $\psi_{j,\ell,k}$ have frequency support contained in the set 

$$\{(\xi_1,\xi_2) : \xi_1 \in [-2^{2j-1},-2^{2j-4}] \cup [2^{2j-4},2^{2j-1}],\ |\xi_2/\xi_1 - \ell\,2^{-j}| \le 2^{-j}\}.$$

Thus, each element $\hat\psi_{j,\ell,k}$ is supported on a pair of trapezoids of approximate size 
$2^{2j} \times 2^{j}$, oriented along lines of slope $\ell\,2^{-j}$ [see Fig. 3.12(b)]. 

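From Eqs. (3.41) and (3.42), it follows that the functions $\{\hat\psi^{(h)}(\xi A_4^{-j} B_1^{-\ell})\}$ form a tiling of 
the set 

$$\mathcal{D}_h = \{(\xi_1,\xi_2) \in \hat{\mathbb{R}}^2 : |\xi_1| \ge \tfrac{1}{8},\ |\xi_2/\xi_1| \le 1\}.$$

Indeed, for $(\xi_1,\xi_2) \in \mathcal{D}_h$, 

$$\sum_{j \ge 0} \sum_{\ell=-2^j}^{2^j-1} |\hat\psi^{(h)}(\xi A_4^{-j} B_1^{-\ell})|^2 = \sum_{j \ge 0} \sum_{\ell=-2^j}^{2^j-1} |\hat\psi_1(2^{-2j}\xi_1)|^2\,|\hat\psi_2(2^{j}\xi_2/\xi_1 - \ell)|^2 = 1. \qquad (3.43)$$

An illustration of this frequency tiling is shown in Fig. 3.12(a). 


Fig. 3.12: (a) The tiling of the frequency plane $\hat{\mathbb{R}}^2$ induced by the shearlets. The tiling of $\mathcal{D}_h$ is 
illustrated by solid lines, while the tiling of $\mathcal{D}_v$ is in dashed lines. (b) The frequency support 
of a shearlet $\psi_{j,\ell,k}$ satisfies parabolic scaling (approximate size $2^{2j} \times 2^{j}$). The figure shows only 
the support for $\xi_1 > 0$; the other half of the support, for $\xi_1 < 0$, is symmetrical. 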


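Letting $L^2(\mathcal{D}_h)^\vee = \{f \in L^2(\mathbb{R}^2) : \operatorname{supp}\hat f \subset \mathcal{D}_h\}$, property (3.43) and the fact that 
$\hat\psi^{(h)}$ is supported inside $[-1/2,1/2]^2$ imply that the discrete shearlet system 

$$\Psi_d^{(h)} = \{\psi_{j,\ell,k} : j \ge 0,\ -2^j \le \ell \le 2^j - 1,\ k \in \mathbb{Z}^2\}$$

is a Parseval frame for $L^2(\mathcal{D}_h)^\vee$. Similarly, we can construct a Parseval frame for 
$L^2(\mathcal{D}_v)^\vee$, where $\mathcal{D}_v$ is the vertical cone $\mathcal{D}_v = \{(\xi_1,\xi_2) \in \hat{\mathbb{R}}^2 : |\xi_2| \ge 1/8,\ |\xi_1/\xi_2| \le 1\}$. 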
Specifically, let 



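and let $\psi^{(v)}$ be given by 

$$\hat\psi^{(v)}(\xi) = \hat\psi^{(v)}(\xi_1,\xi_2) = \hat\psi_1(\xi_2)\,\hat\psi_2(\xi_1/\xi_2).$$

Then the collection 

$$\Psi_d^{(v)} = \{\psi^{(v)}_{j,\ell,k} : j \ge 0,\ -2^j \le \ell \le 2^j - 1,\ k \in \mathbb{Z}^2\},$$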

where Yj 'h- = D^D^T k is a Parseval frame for L 2 (£> v ) v . 

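Finally, let $\hat\varphi \in C_0^\infty(\mathbb{R}^2)$ be chosen to satisfy 

$$|\hat\varphi(\xi)|^2 + \sum_{j \ge 0} \sum_{\ell=-2^j}^{2^j-1} |\hat\psi_1(2^{-2j}\xi_1)|^2\,|\hat\psi_2(2^{j}\xi_2/\xi_1 - \ell)|^2\,\chi_{\mathcal{D}_h}(\xi)$$
$$\quad + \sum_{j \ge 0} \sum_{\ell=-2^j}^{2^j-1} |\hat\psi_1(2^{-2j}\xi_2)|^2\,|\hat\psi_2(2^{j}\xi_1/\xi_2 - \ell)|^2\,\chi_{\mathcal{D}_v}(\xi) = 1, \quad \text{for } \xi \in \hat{\mathbb{R}}^2,$$

where $\chi_{\mathcal{D}}$ is the indicator function of the set $\mathcal{D}$. This implies that $\operatorname{supp}\hat\varphi \subset 
[-1/8,1/8]^2$, $|\hat\varphi(\xi)| = 1$ for $\xi \in [-1/16,1/16]^2$, and the collection $\{\varphi_k : k \in \mathbb{Z}^2\}$ 
defined by $\varphi_k(x) = \varphi(x - k)$ is a Parseval frame for $L^2([-1/16,1/16]^2)^\vee$. 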

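Thus, letting $\hat{\tilde\psi}^{(\omega)}_{j,\ell,k}(\xi) = \hat\psi^{(\omega)}_{j,\ell,k}(\xi)\,\chi_{\mathcal{D}_\omega}(\xi)$ for $\omega = h$ or $\omega = v$, we have the 
following result. 

Theorem 3.16. The discrete shearlet system 

$$\{\varphi_k : k \in \mathbb{Z}^2\} \cup \{\tilde\psi^{(\omega)}_{j,\ell,k}(x) : j \ge 0,\ \ell = -2^j, 2^j - 1,\ k \in \mathbb{Z}^2,\ \omega = h,v\}$$
$$\cup\ \{\psi^{(\omega)}_{j,\ell,k}(x) : j \ge 0,\ -2^j + 1 \le \ell \le 2^j - 2,\ k \in \mathbb{Z}^2,\ \omega = h,v\}$$

is a Parseval frame for $L^2(\mathbb{R}^2)$. 

The “corner” elements $\tilde\psi^{(\omega)}_{j,\ell,k}(x)$, $\ell = -2^j, 2^j - 1$, are simply obtained by truncation 
on the cones $\mathcal{D}_\omega$ in the frequency domain. Notice that the corner elements in the 
horizontal cone $\mathcal{D}_h$ match nicely with those in the vertical cone $\mathcal{D}_v$. We refer to 
[80, 130] for additional details on this construction. 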


3.5.4 Optimal Representations Using Shearlets 

One major feature of shearlet systems is that if $f$ is a compactly supported function 
that is $C^2$ away from a $C^2$ curve, then the sequence of discrete shearlet coefficients 
$\{\langle f, \psi_{j,\ell,k}\rangle\}$ has (essentially) optimally fast decay. To make this more precise, let 
$f_N^S$ be the $N$-term approximation of $f$ obtained from the $N$ largest coefficients of its 
shearlet expansion, namely, 

$$f_N^S = \sum_{\mu \in I_N} \langle f, \psi_\mu\rangle\,\psi_\mu,$$

where $I_N \subset M$ is the set of indices corresponding to the $N$ largest entries of the 
sequence $\{|\langle f, \psi_\mu\rangle|^2 : \mu \in M\}$. Also, we follow [38] and introduce $STAR^2(A)$, a 
class of indicator functions of sets $B$ with $C^2$ boundaries $\partial B$. In polar coordinates, 
let $\rho(\theta) : [0,2\pi) \to [0,1]$ be a radius function and define $B$ by $x \in B$ if and only if 
$|x| \le \rho(\theta)$. In particular, the boundary $\partial B$ is given by the curve in $\mathbb{R}^2$: 

$$\beta(\theta) = \begin{pmatrix} \rho(\theta)\cos\theta \\ \rho(\theta)\sin\theta \end{pmatrix}. \qquad (3.44)$$

The class of boundaries of interest to us is defined by 

$$\sup|\rho''(\theta)| \le A, \qquad \rho \le \rho_0 < 1. \qquad (3.45)$$

We say that a set $B \in STAR^2(A)$ if $B \subset [0,1]^2$ and $B$ is a translate of a set obeying 
(3.44) and (3.45). Finally, we define the set $\mathcal{E}^2(A)$ of functions that are $C^2$ away 
from a $C^2$ edge as the collection of functions of the form 

$$f = f_0 + f_1\,\chi_B,$$

where $f_0, f_1 \in C_0^2([0,1]^2)$, $B \in STAR^2(A)$, and $\|f\|_{C^2} = \sum_{|\alpha| \le 2} \|D^\alpha f\|_\infty \le 1$. We can 
now state the following result from [124]. 

Theorem 3.17. Let $f \in \mathcal{E}^2(A)$ and $f_N^S$ be the approximation to $f$ defined above. 
Then 

$$\|f - f_N^S\|_2^2 \le C\,N^{-2}\,(\log N)^3.$$
Notice that the approximation rate of shearlet systems significantly improves on that of 
wavelets, in which case the approximation error $\|f - f_N^W\|_2^2$ decays at most as fast 
as $O(N^{-1})$ [175], where $f_N^W$ is the $N$-term approximation of $f$ obtained from the 
$N$ largest coefficients in the wavelet expansion. Indeed, the shearlet representation 
is essentially optimal for the kind of functions considered here, since the optimal 
theoretical approximation rate (cf. [65]) satisfies 

$$\|f - f_N\|_2^2 \asymp N^{-2}, \qquad N \to \infty.$$
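
To get a feeling for the gap between the two rates, one can tabulate the shearlet bound $C N^{-2}(\log N)^3$ against the wavelet rate $N^{-1}$. This is only an illustrative sketch: the constants in both estimates are unknown and are set to 1 here as an assumption, and the function names `shearlet_bound` and `wavelet_rate` are hypothetical.

```python
import math


def shearlet_bound(n, c=1.0):
    """Upper bound C * N^(-2) * (log N)^3 of Theorem 3.17 (C = 1 assumed)."""
    return c * n ** -2 * math.log(n) ** 3


def wavelet_rate(n, c=1.0):
    """Reference rate C * N^(-1) for wavelet approximation (C = 1 assumed)."""
    return c / n


for n in (10, 100, 1000, 10 ** 4, 10 ** 6):
    ratio = shearlet_bound(n) / wavelet_rate(n)
    print(n, shearlet_bound(n), wavelet_rate(n), ratio)
```

With equal constants the logarithmic factor dominates for small $N$, but the ratio $(\log N)^3 / N$ tends to zero, so the shearlet bound is eventually far smaller than the wavelet rate.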

Only the curvelet system of Candès and Donoho is known to satisfy similar 
approximation properties [38]. However, the curvelet construction has a number 
of important differences, including the fact that the curvelet system is not associated 
with a fixed translation lattice and, unlike the shearlet system, is not an affine-like 
system, since it is not generated from the action of a family of operators on a single 
or finite family of functions. 

The optimal sparsity of the shearlet system plays a fundamental role in a number 
of applications. For example, the shearlet system can be applied to provide a sparse 
representation of Fourier integral operators, a very important class of operators that 
appear in problems from partial differential equations [125, 126]. Another class 
of applications comes from image processing, where the sparsity of the shearlet 
representation is closely related to the ability to efficiently separate the relevant 
features of an image from noise. A number of results in this direction are described 
in [79-81]. 
1. Show that Eq. (3.12) in Theorem 3.5 can be simplified to obtain (3.13). Next, 
show that, for $n = 1$, when the dilation matrix $a$ is replaced by the dyadic factor 
2, Eq. (3.13) yields the “classical” Gripenberg-Wang equations (3.4) and (3.5). 

2. Show that the matrices 


M = 



and $M = \begin{pmatrix} a & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}$, 


where a > 1, are expanding on a subspace (that is, they satisfy Definition 3.7). 

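3. Show that Theorem 3.8 is valid for functions on subspaces of $L^2(\mathbb{R}^n)$ of the form 

$$L^2(V)^\vee = \{f \in L^2(\mathbb{R}^n) : \operatorname{supp}\hat f \subset V\}.$$

4. Let $\psi_1 \in L^2(\mathbb{R})$ be a dyadic wavelet with $\operatorname{supp}\hat\psi_1 \subset [-1/2,1/2]$ and $\psi_2 \in L^2(\mathbb{R})$ 
be such that $\operatorname{supp}\hat\psi_2 \subset [-1,1]$ and 

$$\sum_{k \in \mathbb{Z}} |\hat\psi_2(\omega + k)|^2 = 1 \quad \text{for a.e. } \omega \in \mathbb{R}.$$

For $\xi = (\xi_1,\xi_2) \in \hat{\mathbb{R}}^2$, let $\psi$ be defined by $\hat\psi(\xi) = \hat\psi_1(\xi_1)\,\hat\psi_2(\xi_2/\xi_1)$. Show 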
that the affine system : ij G Z,& G Z 2 }, where A = ^ and 

# = ^ , is a Parseval frame for L 2 (M 2 ). 

5. Prove Proposition 3.14 by modifying the argument of Proposition 3.14. 

6. Let $\psi$ be a Schwartz class function and $\mathcal{S}_\psi$ be the fine-scale continuous shearlet 
transform (for $\alpha = 1/2$), as defined in this chapter. Show that, for any $s \in \mathbb{R}$, the 
continuous shearlet transform of the Dirac delta distribution satisfies 

$$\mathcal{S}_\psi\delta(a,s,(0,0)) \sim a^{-3/4},$$

asymptotically as $a \to 0$. Show that if $t \neq (0,0)$, then, for any $N \in \mathbb{N}$, there is a 
constant $C_N > 0$ such that 

$$\mathcal{S}_\psi\delta(a,s,t) \le C_N\,a^{N},$$

asymptotically as $a \to 0$. 

7. Let $\psi$ and $\mathcal{S}_\psi$ be as in Exercise 6. For $p \in \mathbb{R}$, consider the distribution $\nu_p(x_1,x_2)$ 
defined by 

$$\int_{\mathbb{R}^2} \nu_p(x_1,x_2)\,f(x_1,x_2)\,dx_1\,dx_2 = \int_{\mathbb{R}} f(p\,x_2,x_2)\,dx_2.$$

Show that, for $s = -p$ and $t_1 = p\,t_2$, we have 

$$\mathcal{S}_\psi\nu_p(a,s,(t_1,t_2)) \sim a^{-1/4},$$

asymptotically as $a \to 0$. Show that, for all other values of $t = (t_1,t_2)$ or $s$ and 
for any $N \in \mathbb{N}$, there is a constant $C_N > 0$ such that 

$$\mathcal{S}_\psi\nu_p(a,s,t) \le C_N\,a^{N},$$

asymptotically as $a \to 0$. 


Chapter 4 

Wavelets on the Sphere 


Pierre Vandergheynst and Yves Wiaux 


Abstract In many application fields, ranging from astrophysics and geophysics 
to neuroscience, computer vision, and computer graphics, data to be analyzed are 
defined as functions on the sphere. In all these situations, there are compelling reasons 
to design dedicated data analysis tools that are adapted to spherical geometry, 
for one cannot simply project the data in Euclidean geometry without having to 
deal with severe distortions. The wavelet transform has become a ubiquitous tool 
in signal processing mostly for its ability to exploit the multiscale nature of many 
data sets, and it is thus quite natural to generalize it to signals on the sphere. This 
generalization is not trivial, for the main ingredient of wavelet theory, dilation, is 
not well defined on the sphere. Moreover, when turning to algorithms, one faces 
the problem that sampling data on the sphere is not an easy task either. In this 
chapter, we discuss recently developed results for the analysis and reconstruction 
of signals on the sphere with wavelets, on the basis of theory, implementation, and 
applications. 


4.1 Introduction 

There are many application scenarios where data to be analyzed are defined as a 
scalar function on the sphere. Some of the most common examples include processing 
geodesic signals, climate indicators (atmospheric or ocean temperature, for 
example), or astronomical data defined on the celestial sphere. Recently, with the 
advent of advanced imaging modalities and devices, data sets defined in spherical 
geometry have started to appear in many other areas. In computer vision, 
catadioptric cameras allow one to record omnidirectional images using a regular 
sensor overlooking a curved mirror. The captured images are most naturally 
expressed in spherical coordinates around the focal point of the system. In computer 
graphics, complicated but closed genus-zero surfaces are expressed as elevation 
maps in spherical coordinates. One then seeks to process these surfaces so as to 
reveal or simplify their shape attributes. 

In all these situations, there are compelling reasons to design dedicated data 
analysis tools that are adapted to spherical geometry, for one cannot simply project 
the data in Euclidean geometry without having to deal with severe distortions. The 
wavelet transform has become a ubiquitous tool in signal processing mostly for its 
ability to exploit the multiscale nature of many data sets, and it is thus quite natural 
to generalize it to signals on the sphere. However, this generalization is not trivial, 
for the main ingredient of the wavelet theory, dilation, is not well defined on the 
sphere. Moreover, when turning to algorithms, one faces the problem that sampling 
data on the sphere is not an easy task either: There is no preferred sampling grid 
similar to the $\mathbb{Z}^2$ lattice in the plane. 

There have been many attempts at generalizing the wavelet transform to the 
sphere, and it is well beyond the scope of the present chapter to review all 
approaches. Instead, we will focus on recently developed results that provide a 
sensitive mathematical framework and that can be efficiently implemented by provably 
stable and fast algorithms. We will proceed by first defining and studying several 
possible dilation operations on the sphere. A continuous wavelet formalism will then 
be simply defined by generalizing the operation of correlating a signal with suitably 
dilated waveforms. We will then define a scale-discretized wavelet formalism in 
order to allow the practical reconstruction of a signal from its wavelet coefficients. 
We will also discuss fast algorithms allowing efficient analysis and reconstruction 
of digital data. Finally, we will conclude with applications in astrophysics and 
neuroscience illustrating the usefulness of the proposed tools. 


4.2 Scale-Space Premises 

In this section, we discuss the notion of directional correlation on the sphere, we 
concisely recall the harmonic analysis on the sphere and on the rotation group, and 
we discuss affine transformations, in particular dilations. These are the essential 
tools for the definition of a wavelet formalism. 


4.2.1 Directional Correlations 

We consider a three-dimensional Cartesian coordinate system $(o, ox, oy, oz)$ centered 
on the unit sphere $\mathbb{S}^2$, and where the direction $oz$ identifies the north pole. Any point 
$\omega$ on the sphere is identified by its corresponding spherical coordinates $(\theta,\varphi)$, 
where $\theta \in [0,\pi]$ stands for the colatitude, or polar angle, and $\varphi \in [0,2\pi)$ for the 
longitude, or azimuthal angle. We consider signals $F$ and analysis functions $\Psi$ 
on the sphere as described by elements of the Hilbert space of square-integrable 
functions $L^2(\mathbb{S}^2, d\Omega)$, with the invariant measure $d\Omega = d\cos\theta\,d\varphi$. In this space, 
the scalar product between two functions $F_1$ and $F_2$ reads as 
$\langle F_1|F_2\rangle = \int_{\mathbb{S}^2} d\Omega\,F_1^*(\omega)\,F_2(\omega)$. 

In order to perform a scale-space analysis of signals, continuous affine transformations 
such as translations, rotations, and dilations on the sphere must be 
applied to the analysis function. These affine transformations are mathematically 
defined and described in detail below. In a few words, the continuous translations 
by $\omega_0 = (\theta_0,\varphi_0) \in \mathbb{S}^2$ and rotations by $\chi \in [0,2\pi)$ are defined by the three Euler 
angles defining an element $\rho = (\varphi_0,\theta_0,\chi)$ of the group of rotations in three dimensions 
SO(3). The continuous dilations affect by definition the continuous scale of 
the function and may be parametrized in terms of some dilation factor $a \in \mathbb{R}_+^*$. The 
precise definition of spherical dilations is the main challenge for building spherical 
wavelets, and clean mathematical arguments will be given in Section 4.2.3. Let us 
simply assume that the operation is formally defined so that we can fix notations. 

The analysis of the signal $F$ with an analysis function $\Psi$ defines wavelet coefficients 
through the scalar products of $F$ with the translated, rotated, and dilated 
functions as 

$$W_\Psi^F(\rho,a) = \langle \Psi_{\rho,a}|F\rangle. \qquad (4.1)$$

This relation also defines the so-called directional correlation of $F$ with the 
dilated functions $\Psi_a$. At each scale $a$, the function $W_\Psi^F(\cdot,a)$ of $\rho$ identifying the 
wavelet coefficients is an element of the Hilbert space of square-integrable functions 
$L^2(\mathrm{SO}(3), d\rho)$ on the rotation group SO(3), with the invariant measure $d\rho = 
d\varphi\,d\cos\theta\,d\chi$. They characterize the signal around each point $\omega_0$, and in each 
orientation $\chi$. This defines the scale-space nature of the wavelet decomposition on 
the sphere. In this context, some basic knowledge of harmonic analysis on both $\mathbb{S}^2$ 
and SO(3) is absolutely essential. 


4.2.2 Harmonic Analysis 


4.2.2.1 On the Sphere $\mathbb{S}^2$ 


As discussed, any point $\omega$ on the sphere $\mathbb{S}^2$ may be identified as $\omega = (\theta,\varphi)$, with 
$\theta \in [0,\pi]$ and $\varphi \in [0,2\pi)$, and we consider signals in $L^2(\mathbb{S}^2, d\Omega)$. The harmonic 
analysis in this space may be summarized as follows. The spherical harmonics 
$Y_{lm}(\omega)$ form an orthonormal basis of $L^2(\mathbb{S}^2, d\Omega)$, with $l \in \mathbb{N}$, $m \in \mathbb{Z}$, and $|m| \le l$. 
They are explicitly given in a factorized form in terms of the associated Legendre 
polynomials $P_l^m(\cos\theta)$ and the complex exponentials $e^{im\varphi}$ as 

$$Y_{lm}(\theta,\varphi) = \left[\frac{2l+1}{4\pi}\,\frac{(l-m)!}{(l+m)!}\right]^{1/2} P_l^m(\cos\theta)\,e^{im\varphi}. \qquad (4.2)$$

The index $l$ represents an overall frequency on the sphere. The absolute value $|m|$ 
represents the frequency associated with the azimuthal variable $\varphi$. The definition 
(4.2) corresponds to the choice of Condon-Shortley phase $(-1)^m$ for the spherical 
harmonics, ensuring the relation 

$$(-1)^m\,Y_{lm}^*(\omega) = Y_{l(-m)}(\omega). \qquad (4.3)$$

This phase is included in the definition of the associated Legendre polynomials 
[1, 229]. Another convention [31] explicitly transfers it to the spherical harmonics. 
The orthonormality and completeness relations for the spherical harmonics respectively 
read as 

$$\int_{\mathbb{S}^2} d\Omega\,Y_{lm}^*(\omega)\,Y_{l'm'}(\omega) = \delta_{ll'}\,\delta_{mm'} \qquad (4.4)$$

and 

$$\sum_{l \in \mathbb{N}} \sum_{|m| \le l} Y_{lm}^*(\omega')\,Y_{lm}(\omega) = \delta^2(\omega' - \omega), \qquad (4.5)$$

with the notation $\delta^2(\omega' - \omega) = \delta(\cos\theta' - \cos\theta)\,\delta(\varphi' - \varphi)$. 

Any function $G \in L^2(\mathbb{S}^2, d\Omega)$ is thus uniquely given as a linear combination of 
spherical harmonics: 

$$G(\omega) = \sum_{l \in \mathbb{N}} \sum_{|m| \le l} \hat{G}_{lm}\,Y_{lm}(\omega). \qquad (4.6)$$

This combination defines the inverse spherical harmonic transform on $\mathbb{S}^2$. The 
corresponding spherical harmonic coefficients are given by the scalar products in 
$L^2(\mathbb{S}^2, d\Omega)$: 

$$\hat{G}_{lm} = \int_{\mathbb{S}^2} d\Omega\,Y_{lm}^*(\omega)\,G(\omega), \qquad (4.7)$$

with $l \in \mathbb{N}$, $m \in \mathbb{Z}$, and $|m| \le l$. 

By definition, any function $G \in L^2(\mathbb{S}^2, d\Omega)$ explicitly depending on the 
azimuthal angle $\varphi$ is said to be directional. It exhibits generic spherical harmonic 
coefficients $\hat{G}_{lm}$ for $l \in \mathbb{N}$, $m \in \mathbb{Z}$, and $|m| \le l$. For a real function $G$, these spherical 
harmonic coefficients also satisfy the reality constraint $(-1)^m\,\hat{G}_{lm}^* = \hat{G}_{l(-m)}$ 
following from the symmetry (4.3). Any function $G \in L^2(\mathbb{S}^2, d\Omega)$ independent of the 
azimuthal angle $\varphi$ is said to be zonal, or axisymmetric: $G = G(\theta)$. It only exhibits 
nonzero spherical harmonic coefficients for $m = 0$: $\hat{G}_{lm} = \hat{G}_{l0}\,\delta_{m0}$. For a real function 
$G$, these spherical harmonic coefficients also satisfy the reality constraint $\hat{G}_{l0}^* = \hat{G}_{l0}$, 
still following from the symmetry (4.3). Such an axisymmetric function is obviously 
invariant under rotation around itself by any angle $\chi \in [0,2\pi)$. 

Notice that the orthonormality of scalar spherical harmonics implies the following 
Plancherel relation for $F_1, F_2 \in L^2(\mathbb{S}^2, d\Omega)$: 

$$\langle F_1|F_2\rangle = \sum_{l \in \mathbb{N}} \sum_{|m| \le l} (\hat{F}_1)_{lm}^*\,(\hat{F}_2)_{lm}. \qquad (4.8)$$
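
The definition (4.2), the symmetry (4.3), and the orthonormality relation (4.4) can be checked numerically at low degree. The following is a minimal sketch using only the Python standard library; the function names (`assoc_legendre`, `sph_harm_lm`, `inner`), the midpoint quadrature, and the grid sizes are choices made for this illustration and are not the fast algorithms discussed later in the chapter. The associated Legendre polynomials are computed with a standard upward recurrence that includes the Condon-Shortley phase.

```python
import cmath
import math


def assoc_legendre(l, m, x):
    """Associated Legendre P_l^m(x) for 0 <= m <= l, Condon-Shortley phase included."""
    pmm = 1.0
    if m > 0:
        root = math.sqrt((1.0 - x) * (1.0 + x))
        odd = 1.0
        for _ in range(m):
            pmm *= -odd * root
            odd += 2.0
    if l == m:
        return pmm
    prev, cur = pmm, x * (2 * m + 1) * pmm
    for ll in range(m + 2, l + 1):
        prev, cur = cur, (x * (2 * ll - 1) * cur - (ll + m - 1) * prev) / (ll - m)
    return cur


def sph_harm_lm(l, m, theta, phi):
    """Y_lm(theta, phi) following (4.2); negative m uses the standard P_l^{-m} relation."""
    am = abs(m)
    p = assoc_legendre(l, am, math.cos(theta))
    if m < 0:
        p *= (-1) ** am * math.factorial(l - am) / math.factorial(l + am)
    norm = math.sqrt((2 * l + 1) / (4 * math.pi)
                     * math.factorial(l - m) / math.factorial(l + m))
    return norm * p * cmath.exp(1j * m * phi)


def inner(l1, m1, l2, m2, n_u=400, n_phi=16):
    """Midpoint-rule estimate of <Y_{l1 m1} | Y_{l2 m2}> with dOmega = d(cos theta) d(phi)."""
    du, dphi = 2.0 / n_u, 2.0 * math.pi / n_phi
    total = 0.0 + 0.0j
    for i in range(n_u):
        theta = math.acos(-1.0 + (i + 0.5) * du)
        for k in range(n_phi):
            phi = (k + 0.5) * dphi
            total += (sph_harm_lm(l1, m1, theta, phi).conjugate()
                      * sph_harm_lm(l2, m2, theta, phi))
    return total * du * dphi


print(abs(inner(2, 1, 2, 1)))   # close to 1
print(abs(inner(2, 1, 1, 1)))   # close to 0
```

The quadrature in $\cos\theta$ mirrors the invariant measure $d\Omega = d\cos\theta\,d\varphi$; it is only accurate to a few digits, which is enough to see the orthonormality pattern.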
4.2.2.2 On the Rotation Group SO(3) 

Any rotation $\rho$ in the group of rotations in three dimensions SO(3) is given in 
terms of the three Euler angles $\rho = (\varphi,\theta,\chi)$, with $\theta \in [0,\pi]$, and $\varphi,\chi \in [0,2\pi)$. 
We consider signals $H$ in the Hilbert space of square-integrable functions $L^2(\mathrm{SO}(3), 
d\rho)$, with the invariant measure $d\rho = d\varphi\,d\cos\theta\,d\chi$. The harmonic analysis in this 
space may be summarized as follows. The Wigner $D$-functions are the matrix 
elements of the irreducible unitary representations of weight $l$ of the group in 
$L^2(\mathrm{SO}(3), d\rho)$. By the Peter-Weyl theorem on compact groups, the matrix elements 
$D^{l*}_{mn}$ also form an orthogonal basis in $L^2(\mathrm{SO}(3), d\rho)$, with $l \in \mathbb{N}$, $m,n \in \mathbb{Z}$, and 
$|m|, |n| \le l$. They are explicitly given in a factorized form in terms of the real Wigner 
$d$-functions $d^l_{mn}(\theta)$ and the complex exponentials, $e^{-im\varphi}$ and $e^{-in\chi}$, as 

$$D^l_{mn}(\varphi,\theta,\chi) = e^{-im\varphi}\,d^l_{mn}(\theta)\,e^{-in\chi}. \qquad (4.9)$$

Again, $l$ represents an overall frequency on SO(3), and $|m|$ and $|n|$ the frequencies 
associated with the variables $\varphi$ and $\chi$, respectively [229, 31]. The Wigner 
$d$-functions read as 

$$d^l_{mn}(\theta) = \sum_{t=C_1}^{C_2} \frac{(-1)^t\,[(l+m)!\,(l-m)!\,(l+n)!\,(l-n)!]^{1/2}}{(l+m-t)!\,(l-n-t)!\,t!\,(t+n-m)!} \left(\cos\frac{\theta}{2}\right)^{2l+m-n-2t} \left(\sin\frac{\theta}{2}\right)^{2t+n-m},$$

with the summation bounds $C_1 = \max(0, m-n)$ and $C_2 = \min(l+m, l-n)$ defined 
to consider only factorials of positive integers. They satisfy various symmetry properties 
on their indices. The orthogonality and completeness relations of the Wigner 
$D$-functions respectively read as 

$$\int_{\mathrm{SO}(3)} d\rho\,D^{l*}_{mn}(\rho)\,D^{l'}_{m'n'}(\rho) = \frac{8\pi^2}{2l+1}\,\delta_{ll'}\,\delta_{mm'}\,\delta_{nn'} \qquad (4.10)$$

and 

$$\sum_{l \in \mathbb{N}} \frac{2l+1}{8\pi^2} \sum_{|m|,|n| \le l} D^{l*}_{mn}(\rho')\,D^{l}_{mn}(\rho) = \delta^3(\rho' - \rho), \qquad (4.11)$$

with $\delta^3(\rho' - \rho) = \delta(\varphi' - \varphi)\,\delta(\cos\theta' - \cos\theta)\,\delta(\chi' - \chi)$. Notice that for $n = 0$, the 
Wigner $D$-functions are independent of $\chi$ and simply identify with the spherical 
harmonics: 

$$D^l_{m0}(\omega) = \left[\frac{4\pi}{2l+1}\right]^{1/2} Y_{lm}^*(\omega). \qquad (4.12)$$
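
The explicit sum for the Wigner $d$-functions translates directly into code. The sketch below is an unoptimized illustration (factorials overflow the floating-point range for large $l$, so it is only meant for small degrees), and the function name `wigner_d` is hypothetical. It reproduces $d^1_{00}(\theta) = \cos\theta$ and $d^1_{10}(\theta) = -\sin\theta/\sqrt{2}$, the latter being consistent with (4.12) and (4.2).

```python
import math


def wigner_d(l, m, n, theta):
    """Real Wigner d-function d^l_{mn}(theta) from the explicit finite sum."""
    c, s = math.cos(theta / 2.0), math.sin(theta / 2.0)
    pref = math.sqrt(math.factorial(l + m) * math.factorial(l - m)
                     * math.factorial(l + n) * math.factorial(l - n))
    total = 0.0
    # Summation bounds C_1 = max(0, m - n) and C_2 = min(l + m, l - n).
    for t in range(max(0, m - n), min(l + m, l - n) + 1):
        denom = (math.factorial(l + m - t) * math.factorial(l - n - t)
                 * math.factorial(t) * math.factorial(t + n - m))
        total += ((-1) ** t * pref / denom
                  * c ** (2 * l + m - n - 2 * t) * s ** (2 * t + n - m))
    return total


theta = 0.7
print(wigner_d(1, 0, 0, theta), math.cos(theta))
print(wigner_d(1, 1, 0, theta), -math.sin(theta) / math.sqrt(2.0))
```

For each fixed $\theta$ the values $d^l_{mn}(\theta)$, $|m|,|n| \le l$, form a real orthogonal matrix, which gives a convenient sanity check on any implementation.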
Any function $H \in L^2(\mathrm{SO}(3), d\rho)$ is thus uniquely given as a linear combination 
of Wigner $D$-functions: 

$$H(\rho) = \sum_{l \in \mathbb{N}} \frac{2l+1}{8\pi^2} \sum_{|m|,|n| \le l} \hat{H}^l_{mn}\,D^{l*}_{mn}(\rho). \qquad (4.13)$$

This combination defines the inverse Wigner $D$-function transform on SO(3). The 
corresponding Wigner $D$-function coefficients are given by the scalar products in 
$L^2(\mathrm{SO}(3), d\rho)$: 

$$\hat{H}^l_{mn} = \int_{\mathrm{SO}(3)} d\rho\,D^l_{mn}(\rho)\,H(\rho), \qquad (4.14)$$

with $l \in \mathbb{N}$, $m,n \in \mathbb{Z}$, and $|m|, |n| \le l$. 


4.2.3 Affine Transformations 

4.2.3.1 Translations and Rotations 

Continuous translations and rotations of square-integrable functions on the sphere 
are described by the three Euler angles defining an element $\rho = (\varphi_0,\theta_0,\chi)$ in SO(3). 
The operator $R(\omega_0)$ in $L^2(\mathbb{S}^2, d\Omega)$ for the translation of amplitude $\omega_0 = (\theta_0,\varphi_0)$ of 
a function $G$ reads as 

$$G_{\omega_0}(\omega) = [R(\omega_0)G](\omega) = G(R_{\omega_0}^{-1}\omega), \qquad (4.15)$$

where $R_{\omega_0}(\theta,\varphi) = [R^{\hat z}_{\varphi_0} R^{\hat y}_{\theta_0}](\theta,\varphi)$ is defined by the three-dimensional rotation 
matrices $R^{\hat y}_{\theta_0}$ and $R^{\hat z}_{\varphi_0}$, acting on the Cartesian coordinates $(x,y,z)$ associated with 
$\omega = (\theta,\varphi)$. The rotation operator $R^{\hat z}(\chi)$ in $L^2(\mathbb{S}^2, d\Omega)$ for the rotation of the function 
$G$ around itself, by an angle $\chi \in [0,2\pi)$, is given as 

$$G_\chi(\omega) = [R^{\hat z}(\chi)G](\omega) = G\big((R^{\hat z}_\chi)^{-1}\omega\big), \qquad (4.16)$$

where $R^{\hat z}_\chi(\theta,\varphi) = (\theta,\varphi + \chi)$ also follows from the action of the three-dimensional 
rotation matrix $R^{\hat z}_\chi$ on the Cartesian coordinates $(x,y,z)$ associated with $\omega = (\theta,\varphi)$. 
The operator incorporating both the translations and rotations simply reads as 
$R(\rho) = R(\omega_0)R^{\hat z}(\chi)$ and $G_\rho(\omega) = [R(\rho)G](\omega) = G(R_\rho^{-1}\omega)$, with $R_\rho = R^{\hat z}_{\varphi_0} R^{\hat y}_{\theta_0} R^{\hat z}_\chi$. 
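Notice that the action of the operator $R(\rho)$ on $G \in L^2(\mathbb{S}^2, d\Omega)$ reads in terms of 
its spherical harmonic coefficients as 

$$\widehat{[R(\rho)G]}_{lm} = \sum_{|n| \le l} D^l_{mn}(\rho)\,\hat{G}_{ln}. \qquad (4.17)$$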
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4.2.3.2 Stereographic Dilation 

As already stated, there is no natural dilation operator for functions defined on S 2 . 
One of the main compelling reasons behind this difficulty is the compactness of the 
sphere: It is not possible to linearly scale the geodesic distance between points as 
one would in M 2 . However, intuitively at least, the size of compact features on the 
sphere may be associated with a given scale. By analogy with the Euclidean case, 
dilation may a priori be defined both in real or harmonic space on § 2 . But as we 
shall see, and contrary to the Euclidean setting, these definitions do not necessarily 
coincide. This will lead us to study several proposed dilations, each of which is well 
defined mathematically and has particular advantages. We will thus formulate the 
wavelet formalism so that we can incorporate those various definitions into a single, 
unifying framework. 

The stereographic dilation of functions is a natural candidate if one wants to
define dilations explicitly in real space on $\mathbb{S}^2$. This naturally appears in the wavelet
formalism on the sphere originally proposed by (3-6), further developed by [235,
236], and reviewed in [7, 237]. The stereographic dilation operator $D(a)$ on
$G\in L^2(\mathbb{S}^2,d\Omega)$, for a continuous dilation factor $a\in\mathbb{R}^*_+$, is defined in terms of the inverse
of the corresponding stereographic dilation $D_a$ on points in $\mathbb{S}^2$. It reads as

$$G_a(\omega) = [D(a)G](\omega) = \lambda^{1/2}(a,\theta)\,G(D_a^{-1}\omega), \tag{4.18}$$

with $\lambda^{1/2}(a,\theta) = a^{-1}[1+\tan^2(\theta/2)]/[1+a^{-2}\tan^2(\theta/2)]$. The dilated point is
given by $D_a(\theta,\varphi) = (\theta_a(\theta),\varphi)$ with the linear relation $\tan(\theta_a(\theta)/2) = a\tan(\theta/2)$.
The dilation operator therefore maps the sphere without its south pole on itself:
$\theta_a(\theta): \theta\in[0,\pi)\to\theta_a\in[0,\pi)$. This dilation operator is uniquely defined by
the requirement of the following natural properties. The dilation of points on $\mathbb{S}^2$
must be a radial (i.e., only affecting the radial variable $\theta$ independently of $\varphi$,
and leaving $\varphi$ invariant) and conformal (i.e., preserving the measure of angles
in the tangent plane at each point) diffeomorphism (i.e., a continuously
differentiable bijection). The normalization by $\lambda^{1/2}(a,\theta)$ in (4.18) is uniquely
determined by the requirement that the dilation of functions in $L^2(\mathbb{S}^2,d\Omega)$ be a
unitary operator [i.e., preserving the scalar product in $L^2(\mathbb{S}^2,d\Omega)$, and specifically
the norm of functions]. Notice that the stereographic dilation operation is
supported by a group structure for the composition law of the corresponding operator
$D(a)$. A group homomorphism also holds with the operation of multiplication by
$a$ on $\mathbb{R}^*_+$.
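The unitarity and group properties stated for (4.18) can be checked numerically. The following is a minimal sketch, assuming an axisymmetric test profile chosen only for illustration; the function names (`dilate_theta`, `lambda_sqrt`, `dilate_function`, `norm_sq`) are hypothetical and not part of any library.

```python
import numpy as np

def dilate_theta(theta, a):
    """Polar angle of the dilated point: tan(theta_a / 2) = a * tan(theta / 2)."""
    return 2.0 * np.arctan(a * np.tan(theta / 2.0))

def lambda_sqrt(a, theta):
    """Normalization factor lambda^{1/2}(a, theta) entering (4.18)."""
    t2 = np.tan(theta / 2.0) ** 2
    return (1.0 + t2) / (a * (1.0 + t2 / a**2))

def dilate_function(G, a):
    """Return G_a = D(a) G for an axisymmetric profile G(theta)."""
    def G_a(theta):
        # The inverse point dilation D_a^{-1} is the point dilation by 1/a.
        return lambda_sqrt(a, theta) * G(dilate_theta(theta, 1.0 / a))
    return G_a

def norm_sq(G, n=200_000):
    """Squared L2 norm on the sphere of an axisymmetric profile (midpoint rule)."""
    dtheta = np.pi / n
    theta = (np.arange(n) + 0.5) * dtheta  # the south pole is never sampled
    return 2.0 * np.pi * np.sum(np.abs(G(theta)) ** 2 * np.sin(theta)) * dtheta

# Hypothetical test profile, localized around the north pole.
G = lambda theta: np.exp(-4.0 * theta**2)
G_half = dilate_function(G, 0.5)
print(norm_sq(G), norm_sq(G_half))  # the two values should agree closely
```

The composition law can be probed in the same way: applying `dilate_theta` with factors `a` and then `b` should coincide with a single dilation by `a * b`.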

Finally, in the Euclidean limit where a function is localized on a small portion of
the sphere, this portion is assimilated to the tangent plane, and the stereographic
dilation identifies with the standard dilation in the plane [6, 235], which is the
expected geometric behavior in this asymptotic regime.

4.2.3.3 Harmonic Dilation

Another possible definition of the dilation of functions may be considered, one
that is explicitly defined in harmonic space on $\mathbb{S}^2$. It was proposed in previous
developments relative to the definition of a wavelet formalism of axisymmetric
[97, 96] and also directional [140, 186] wavelets on the sphere. The harmonic
dilation is defined directly on $G\in L^2(\mathbb{S}^2,d\Omega)$ through a sequence of prescriptions rather
than in terms of the application of a simple operator. First, an arbitrary prescription
must be chosen to define a set of generating functions $\tilde G_m(k)$ of a continuous
variable $k\in\mathbb{R}_+$ for each $m\in\mathbb{Z}$. These functions are identified to the spherical harmonic
coefficients of $G$ through $\tilde G_m(l) = \hat G_{lm}$ for $l\in\mathbb{N}$ and $|m|\le l$. Second, the variable
$k$ is dilated linearly, $l\to k = al$, just as would be the norm of the Fourier
frequency on the plane. For a continuous dilation factor $a\in\mathbb{R}^*_+$, the spherical harmonic
coefficients of the dilated function $G_a$ are defined by

$$(\widehat{G_a})_{lm} = \tilde G_m(al). \tag{4.19}$$

Again, in the Euclidean limit where a function is localized on a small portion of 
the sphere, this portion is assimilated to the tangent plane, and the harmonic dilation 
identifies with the standard dilation in the plane [140]. 
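The prescription (4.19) amounts to resampling each generating function at the dilated argument $al$. A minimal sketch follows, assuming a hypothetical family of generating functions (`gen`) and a simple array layout for the coefficients (row $l$, column $m + L - 1$); neither is taken from the text.

```python
import numpy as np

def harmonic_dilation(gen, a, L):
    """Coefficients (G_a)_{lm} = gen(m, a * l) for 0 <= l < L and |m| <= l."""
    coeffs = np.zeros((L, 2 * L - 1), dtype=complex)
    for l in range(L):
        for m in range(-l, l + 1):
            coeffs[l, m + L - 1] = gen(m, a * l)
    return coeffs

def gen(m, k, B=32.0):
    """Hypothetical generating functions: a smooth bump in k on (0, B), for |m| <= 1."""
    if abs(m) > 1 or not (0.0 < k < B):
        return 0.0
    return np.sin(np.pi * k / B) ** 2

def band_limit(coeffs):
    """Smallest integer such that all coefficients with l at or above it vanish."""
    nonzero_l = np.nonzero(np.any(coeffs != 0, axis=1))[0]
    return 0 if nonzero_l.size == 0 else int(nonzero_l[-1]) + 1

print(band_limit(harmonic_dilation(gen, 1.0, 64)))  # band limit of the original function
print(band_limit(harmonic_dilation(gen, 2.0, 64)))  # band limit after dilation by a = 2
```

With this particular bump, dilating by $a = 2$ halves the band limit, which illustrates how the harmonic dilation rescales the harmonic support.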

Notice that in the framework of scale-space signal processing through the linear
heat flow on the sphere [34, 32, 33], the harmonic dilation applied to axisymmetric
filters appears to be an extremely natural procedure. Considering the heat diffusion
equation on the sphere, one may understand the signal $F$ to be analyzed as an
initial temperature distribution, at time $t = 0$. The analysis of the signal is performed
through the analysis of the temperature distribution at any instant $t\in\mathbb{R}_+$ in the
course of the diffusion process, which reveals larger and larger scales in the signal.
The heat kernel is an axisymmetric function $\Phi_{\rm heat}$ defined as a function of time $t$ by
the following spherical harmonic coefficients:

$$\big[\widehat{(\Phi_{\rm heat})_t}\big]_{lm} = \sqrt{\frac{2l+1}{4\pi}}\,e^{-l(l+1)t}\,\delta_{m0}. \tag{4.20}$$

The solution of the heat equation for an initial condition $F$ simply results from its
scalar products with the heat kernel at any $\omega_0\in\mathbb{S}^2$ and $t\in\mathbb{R}_+$:

$$W^F_{\Phi_{\rm heat}}(\omega_0,t) = \langle(\Phi_{\rm heat})_{\omega_0,t}|F\rangle. \tag{4.21}$$

At $t = 0$, the heat kernel is given by the optimally localized Dirac delta
distribution and the initial temperature distribution identifies with the signal itself. In the
limit $t\to\infty$, the kernel is a constant function on the sphere with the unique
coefficient $\big[\widehat{(\Phi_{\rm heat})_{t\to\infty}}\big]_{00} = (4\pi)^{-1/2}$ and the well-known constant asymptotic
temperature distribution identifies with the mean of the signal. The dilation process in this
context is very similar to the harmonic dilation applied to axisymmetric functions.
A generating function $\tilde\Phi_{\rm heat}(k)$ of a continuous variable $k\in\mathbb{R}_+$ can be defined for
$\Phi_{\rm heat}$, and this variable is dilated linearly. However, this variable is not identified with the
spherical harmonic index $l$ itself, but with $l(l+1)$. Notice that the dilation process is
additive in the time variable $t\in\mathbb{R}_+$, rather than multiplicative in the corresponding
dilation factor $a\in\mathbb{R}^*_+$ in the case of the harmonic dilation. Directional filters were
also considered in this context [32].
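The heat-flow picture can be sketched in harmonic space: by (4.20) and the axisymmetric correlation rule of (4.40), each coefficient of the signal is simply damped by $e^{-l(l+1)t}$. The snippet below is an illustration under that assumption; the array layout (row $l$, arbitrary $m$ columns) and the function names are hypothetical.

```python
import numpy as np

def heat_kernel_l0(L, t):
    """Coefficients of the heat kernel at m = 0 for 0 <= l < L, following (4.20)."""
    l = np.arange(L)
    return np.sqrt((2 * l + 1) / (4 * np.pi)) * np.exp(-l * (l + 1) * t)

def diffuse(F_lm, t):
    """Harmonic coefficients of the temperature distribution at time t.

    F_lm has one row per degree l. The axisymmetric correlation multiplies
    row l by sqrt(4 pi / (2 l + 1)) times the kernel coefficient at (l, 0).
    """
    L = F_lm.shape[0]
    l = np.arange(L)
    gain = np.sqrt(4 * np.pi / (2 * l + 1)) * heat_kernel_l0(L, t)
    return gain[:, None] * F_lm

rng = np.random.default_rng(0)
F = rng.normal(size=(8, 15)) + 1j * rng.normal(size=(8, 15))  # hypothetical signal
two_steps = diffuse(diffuse(F, 0.1), 0.25)
one_step = diffuse(F, 0.35)
print(np.max(np.abs(two_steps - one_step)))  # additivity in t: difference near zero
```

The last lines illustrate the additive behavior in the time variable noted above, as opposed to the multiplicative composition of dilation factors.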

On the one hand, the very simple action of the harmonic dilation in harmonic
space exhibits several advantages relative to the stereographic dilation. Notably,
the harmonic dilation ensures that the band limit of a wavelet and of the
corresponding wavelet coefficients is reduced by a factor $a$. Such a multiresolution property is
essential in reducing the memory and computation time requirements for the wavelet
analysis of signals. On the other hand, the harmonic dilation lacks some of the
important properties that hold under stereographic dilation. Notably, as the
harmonic dilation does not act on points, the question of the corresponding
properties of a radial and conformal diffeomorphism makes no sense. The harmonic dilation
of functions is also not a unitary procedure. Moreover, as the harmonic dilation is
explicitly defined in harmonic space, the evolution in real space of the localization
and directionality properties of functions on the sphere through harmonic dilation
is not known analytically. In order to circumvent this last drawback, the definition
of harmonic dilation may be slightly amended to obtain the kernel dilation defined
next.


4.2.3.4 Kernel Dilation 

A function $G\in L^2(\mathbb{S}^2,d\Omega)$ can be defined to be a factorized function in harmonic
space if it can be written in the form

$$\hat G_{lm} = \tilde K_G(l)\,S^G_{lm}, \tag{4.22}$$

for $l\in\mathbb{N}$ and $|m|\le l$. The positive real kernel $\tilde K_G(k)\in\mathbb{R}_+$ is a generating function
of a continuous variable $k\in\mathbb{R}_+$, initially evaluated on integer values $k = l$. The
directionality coefficients $S^G_{lm}$, for $l\in\mathbb{N}$ and $|m|\le l$, define the directional split
of the function. In particular, for a real function $G$, they bear the same symmetry
relation as the spherical harmonic coefficients $\hat G_{lm}$ themselves: $S^{G*}_{lm} = (-1)^m S^G_{l,-m}$.
Without loss of generality, one can impose

$$\sum_{|m|\le l}|S^G_{lm}|^2 = 1, \tag{4.23}$$

for the values of $l$ for which $S^G_{lm}$ is nonzero for at least one value of $m$. Hence,
localization properties of a function $G$, such as a measure of dispersion of angular
distances around its central position as weighted by the function values, are
governed by the kernel and to a lesser extent by the directional split. Indeed, the power
contained in the function $G$ at each allowed value of $l$ is fixed by the kernel only.
The norm of $G\in L^2(\mathbb{S}^2,d\Omega)$ reads as $\|G\|^2 = \sum_{l\in\mathbb{N}}\tilde K_G^2(l)$, where the sum runs over
the values of $l$ for which $S^G_{lm}$ is nonzero for at least one value of $m$. However, the
directional split is essential in defining the directionality properties measuring the
behavior of the function with the azimuthal variable $\varphi$, because it bears the entire
dependence of the spherical harmonic coefficients of the function in the index $m$.

Pierre Vandergheynst and Yves Wiaux

The kernel dilation applied to a factorized function (4.22) is simply defined by
application of the harmonic dilation (4.19) to the kernel only. The directionality of
the dilated function is defined through the same directional split as the original
function. For a continuous dilation factor $a\in\mathbb{R}^*_+$, the dilated function therefore reads as

$$(\widehat{G_a})_{lm} = \tilde K_G(al)\,S^G_{lm}. \tag{4.24}$$

Let us emphasize that the directionality coefficients $S^G_{lm}$ are not affected by
dilations, contrary to what the complete action of the harmonic dilation (4.19) would
imply. The specific directionality properties of the function may, however, be
modified through kernel dilation due to the modification of the values of $l$ identifying the
dilated kernel. Also, notice that the kernel and harmonic dilations strictly identify
with one another when applied to factorized axisymmetric functions $A$, for which
the directional split takes the trivial values $S^A_{lm} = \delta_{m0}$ for $l\in\mathbb{N}$.
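As an illustration of (4.22)-(4.24), the sketch below builds a factorized function from a kernel and a normalized directional split, applies a kernel dilation, and checks that the norm is fixed by the kernel alone. The kernel `K`, the split (frequencies $m = \pm 1$ only), and the array layout (row $l$, column $m + L - 1$) are all assumptions made for the example.

```python
import numpy as np

def normalize_split(S):
    """Impose sum_m |S_lm|^2 = 1 at every l where the split is not identically zero (4.23)."""
    S = np.array(S, dtype=complex)
    power = np.sqrt(np.sum(np.abs(S) ** 2, axis=1))
    nonzero = power > 0
    S[nonzero] /= power[nonzero, None]
    return S

def kernel_dilation(kernel, S, a=1.0):
    """Coefficients kernel(a * l) * S_lm of the dilated function, following (4.24)."""
    l = np.arange(S.shape[0])
    return kernel(a * l)[:, None] * S

L = 16
S = np.zeros((L, 2 * L - 1))
for l in range(1, L):
    S[l, (L - 1) + 1] = 1.0    # m = +1
    S[l, (L - 1) - 1] = -1.0   # m = -1, consistent with the symmetry of a real function
S = normalize_split(S)

K = lambda k: k * np.exp(-k**2 / 50.0)  # hypothetical positive kernel

G = kernel_dilation(K, S)         # original function
G_2 = kernel_dilation(K, S, 2.0)  # kernel-dilated function, a = 2
print(np.sum(np.abs(G) ** 2), np.sum(K(np.arange(1, L)) ** 2))  # norm fixed by the kernel
```

Dividing the rows of `G_2` by `K(2 * l)` recovers `S`, which reflects the statement that the directional split is untouched by the kernel dilation.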

Any function $G\in L^2(\mathbb{S}^2,d\Omega)$ can be said to have a compact harmonic support
in the interval $l\in(\lfloor\alpha^{-1}B\rfloor,B)$, for any $B\in\mathbb{N}^0$ and any real value $\alpha>1$, if

$$\hat G_{lm} = 0 \quad\text{for all } l,m \text{ with } l\notin(\lfloor\alpha^{-1}B\rfloor,B), \tag{4.25}$$

where $\lfloor x\rfloor$ denotes the largest integer value below $x\in\mathbb{R}$. Notice that the compactness
of the harmonic support of $G$ can be defined as the ratio of the band limit to the width
of its support interval. For a factorized function $G$ of the form (4.22), the compact
harmonic support in the interval $l\in(\lfloor\alpha^{-1}B\rfloor,B)$ is ensured by the choice of a kernel
with compact support in the interval $k\in(\alpha^{-1}B,B)$:

$$\tilde K_G(k) = 0 \quad\text{for } k\notin(\alpha^{-1}B,B). \tag{4.26}$$

The compactness of the harmonic support of $G$ can simply be estimated from the
compact support of the kernel as $c(\alpha) = \alpha/(\alpha-1)\in[1,\infty)$. One has $c(\alpha)\to\infty$
when $\alpha\to 1$, and $c(\alpha)\to 1$ when $\alpha\to\infty$. Typical values would be $\alpha = 2$,
corresponding to a compactness $c(2) = 2$, or $\alpha = 1.1$, leading to a higher
compactness $c(1.1) = 11$. By a kernel dilation with a dilation factor $a\in\mathbb{R}^*_+$ in (4.24),
the compact support of the dilated kernel $\tilde K_G(ak)\in\mathbb{R}_+$ is defined in the
interval $k\in(a^{-1}\alpha^{-1}B,a^{-1}B)$. The compact harmonic support of the dilated function
$G_a$ itself is thus defined in the corresponding interval $l\in(\lfloor a^{-1}\alpha^{-1}B\rfloor,\lceil a^{-1}B\rceil)$,
where $\lceil x\rceil$ denotes the smallest integer value above $x\in\mathbb{R}$. In particular, the
compactness of the harmonic support of a function remains invariant through a kernel
dilation.
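The bookkeeping of support intervals under kernel dilation can be made concrete with a few lines of code. This is a sketch under the assumption that the compactness parameter (written `alpha` here) is distinct from the dilation factor `a`; the helper names are hypothetical.

```python
import math

def compactness(alpha):
    """Ratio of the band limit to the width of the kernel support (alpha^{-1} B, B)."""
    return alpha / (alpha - 1.0)

def harmonic_support(B, alpha, a=1.0):
    """Open interval of degrees l bounding the support of the kernel-dilated function."""
    return math.floor(B / (a * alpha)), math.ceil(B / a)

def nonzero_degrees(B, alpha, a, L):
    """Degrees l < L at which a kernel supported on (B / alpha, B) is sampled after dilation."""
    return [l for l in range(L) if B / alpha < a * l < B]

print(compactness(2.0), compactness(1.1))   # typical values quoted in the text
print(harmonic_support(64, 2.0))            # undilated support interval
print(harmonic_support(64, 2.0, a=2.0))     # support interval after dilation by a = 2
```

For $B = 64$ and $\alpha = 2$, the intervals are $(32, 64)$ and $(16, 32)$: the band limit and the width are both halved, so their ratio is unchanged.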

The factorization and compact harmonic support, together with the notion of
steerability introduced in Section 4.3.1, have been shown to be important properties
that ensure good control of the evolution of localization and directionality properties
of functions through kernel dilation [238]. In particular, considering functions with
directionality coefficients $S^G_{lm}$ that become independent of $l$ in the limit $l\to\infty$, it can
be shown that the kernel dilation identifies with the standard dilation in the plane in
the Euclidean limit [140].

Several definitions of dilations in real or harmonic space may hold for the 
development of a wavelet formalism on the sphere, each of which exhibits its 
specific advantages. In the following, we review various wavelet formalisms with 
different definitions for a dilation operation. We review both continuous and discrete 
formalisms, respectively identified by continuous and discrete position, orientation, 
and scale parameters. 


4.3 Continuous Formalism 

In this section, we begin with a definition of a continuous wavelet formalism
relying on a generic dilation operation. We notably introduce the notion of
steerability of the analysis function, which will prove essential in many respects in the
context of a wavelet formalism. We then consider the cases of the stereographic and
kernel dilations. We also comment on the necessary discretization of the translation,
rotation, and dilation parameters with a view to practical implementations of
the wavelet formalism.


4.3.1 Generic Wavelets 

4.3.1.1 Directional Case 

We consider the general case of analysis of a real signal $F$ with a real and directional
analysis function $\Psi$. In a continuous wavelet formalism, the wavelet coefficients
$W^F_\Psi(\rho,a)$ of $F$ with $\Psi$ are defined at each continuous scale $a\in\mathbb{R}^*_+$, around each
continuous point $\omega_0\in\mathbb{S}^2$, and in each continuous orientation $\chi\in[0,2\pi)$, through
the directional correlations (4.1). These coefficients living on SO(3) follow a very
simple expression in harmonic space. At each scale $a$, the directional correlation
reads as an inverse Wigner D-function transform:



(4.27) 


The Wigner D-function coefficients in this relation follow from relations (4.8) and 
(4.17) as the pointwise product of the spherical harmonic coefficients of the signal 
and the wavelet: 



0^L» 


(4.28) 


The operation of decomposition of the signal $F$ into its wavelet coefficients
with the analysis function $\Psi$ may somewhat abusively be called analysis. An
essential feature of a wavelet formalism, as opposed to generic filtering, resides
in the formal possibility of reconstruction of the signal from its wavelet coefficients.
This actually raises the analysis function to the rank of a wavelet. A reconstruction
formula

$$F(\omega) = \int_{\mathbb{R}^*_+}d\mu(a)\int_{SO(3)}d\rho\;W^F_\Psi(\rho,a)\,[R(\rho)L_\Psi\Psi_a](\omega) \tag{4.29}$$

directly follows from (4.28) for a generic scale integration measure $d\mu(a)$. This
measure is fixed by each formalism relying on a specific definition of the dilation
operation. The operator $L_\Psi$ in $L^2(\mathbb{S}^2,d\Omega)$ is defined by its action on the spherical
harmonic coefficients of a function $G$: $\widehat{(L_\Psi G)}_{lm} = \hat G_{lm}/C^l_\Psi$. The reconstruction
formula holds if and only if the analysis function satisfies the following admissibility
condition for all $l\in\mathbb{N}$:

$$0 < C^l_\Psi = \frac{8\pi^2}{2l+1}\sum_{|m|\le l}\int_{\mathbb{R}^*_+}d\mu(a)\,\big|(\widehat{\Psi_a})_{lm}\big|^2 < \infty. \tag{4.30}$$

This intuitively requires that the whole wavelet family $\{\Psi_a(\omega)\}$, for $a\in\mathbb{R}^*_+$, covers
each frequency index $l$ with a finite and nonzero amplitude, hence preserving the
signal information at each frequency.


4.3.1.2 Steerability 

The steerability of a function $G\in L^2(\mathbb{S}^2,d\Omega)$ represents a notion of controlled
directionality. By definition, a function $G\in L^2(\mathbb{S}^2,d\Omega)$ is steerable if any rotation
of the function around itself may be expressed as a linear combination of a finite
number $M$ of basis functions $G_p$:

$$G_\chi(\omega) = \sum_{p=0}^{M-1}k_p(\chi)\,G_p(\omega). \tag{4.31}$$

The square-integrable functions $k_p(\chi)$ on the unit circle $S^1 = [0,2\pi)$, with $0\le p\le M-1$
and $M\in\mathbb{N}^0$, are called interpolation weights.

The generic continuous wavelet formalism described is obviously directly
applicable to steerable analysis functions. If the analysis function $\Psi$ is steerable
with $M$ basis functions $\Psi_p$ and weights $k_p(\chi)$, the linearity of the directional
correlation (4.1) automatically implies that the steerability relation holds identically on
the wavelet coefficients:

$$W^F_\Psi(\rho,a) = \sum_{p=0}^{M-1}k_p(\chi)\,W^F_{\Psi_p}(\omega_0,a), \tag{4.32}$$

for $\rho = (\varphi_0,\theta_0,\chi)$ and $\omega_0 = (\theta_0,\varphi_0)$. In this relation, the wavelet coefficients
$W^F_{\Psi_p}(\omega_0,a)$ simply follow from the standard correlations with the basis functions $\Psi_p$:

$$W^F_{\Psi_p}(\omega_0,a) = \langle(\Psi_p)_{\omega_0,a}|F\rangle. \tag{4.33}$$

Consequently, at a scale $a$ and a point $\omega_0$, the exact value of the wavelet coefficient
in any continuous orientation $\chi$ is known on the basis of the computation of the
finite number $M$ of the coefficients $W^F_{\Psi_p}(\omega_0,a)$.

This property was actually first introduced on the plane [98, 212], and more 
recently defined on the sphere [235, 238]. It is of great interest in the context of 
wavelet analysis in a computational perspective both for analysis of local signal 
orientations (see Section 4.4.2.3) as well as for exact signal reconstruction (see Sec¬ 
tion 4.6.1). In this perspective, this notion of steerability is further discussed in the 
following paragraphs and an equivalent definition in harmonic space is established. 

Intuitively, steerable functions have a nonzero angular width in the azimuthal
angle $\varphi$, which renders them sensitive to a range of directions and enables them to
satisfy the steerability relation. This nonzero angular width naturally corresponds
to an azimuthal band limit $N\in\mathbb{N}^0$ in the frequency index $m$ associated with the
azimuthal variable $\varphi$:

$$\hat G_{lm} = 0 \quad\text{for all } l,m \text{ with } |m|\ge N. \tag{4.34}$$

It can actually be shown that the property of steerability (4.31) is equivalent to the
existence of an azimuthal band limit $N$ (4.34). First, if a function $G$ is steerable
with $M$ basis functions, then the number $T$ of values of $m$ for which $\hat G_{lm}$ has a
nonzero value for at least one value of $l$ is less than or equal to $M$: $M\ge T$. This
was first established for functions on the plane [98], and the proof is absolutely
identical on the sphere. As a consequence, the function has some azimuthal band
limit $N$, with $T\le 2N-1$. Second, if a function $G$ has an azimuthal band limit $N$,
then it is steerable, and the number of basis functions can be reduced at least to
$M = 2N-1$. This second part of the equivalence can be proved by explicitly
deriving a steerability relation for band-limited functions with an azimuthal band limit
$N$. Any band-limited function $G$ can in particular be steered using $M$ rotated
versions $G_{\chi_p} = R^z(\chi_p)G$ as basis functions, and interpolation weights given by simple
translations by $\chi_p$ of a unique square-integrable function $k(\chi)$ on the circle $S^1$:

$$G_\chi(\omega) = \sum_{p=0}^{M-1}k(\chi-\chi_p)\,G_{\chi_p}(\omega), \tag{4.35}$$

for specific rotation angles $\chi_p$ with $0\le p\le M-1$. One may choose $M = 2N-1$
equally spaced rotation angles $\chi_p\in[0,2\pi)$ as $\chi_p = 2\pi p/(2N-1)$, with $0\le p\le 2N-2$.
The function $k(\chi)$ is then defined by the Fourier coefficients $k_m = 1/(2N-1)$
for $|m|\le N-1$, and $k_m = 0$ otherwise. Notice that the angles $\chi_p$ and the structure
of the function $k(\chi)$ are independent of the explicit nonzero values $\hat G_{lm}$.
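The interpolation rule (4.35) with $M = 2N-1$ equally spaced angles can be verified on the azimuthal Fourier components alone. The sketch below assumes the convention that a rotation around itself by $\chi$ maps $G(\theta,\varphi)$ to $G(\theta,\varphi-\chi)$, so that the component of frequency $m$ picks up a phase $e^{-im\chi}$; the function names and the test vector are hypothetical.

```python
import numpy as np

def interpolation_weight(chi, N):
    """k(chi) with Fourier coefficients 1 / (2N - 1) for |m| <= N - 1."""
    m = np.arange(-(N - 1), N)
    return np.sum(np.exp(1j * m * chi)).real / (2 * N - 1)

def rotate(G_m, chi):
    """Azimuthal components (m = -(N-1), ..., N-1) after rotation around itself by chi."""
    N = (len(G_m) + 1) // 2
    m = np.arange(-(N - 1), N)
    return G_m * np.exp(-1j * m * chi)

def steer(G_m, chi):
    """Rotation by chi obtained from the 2N - 1 basis rotations, as in (4.35)."""
    N = (len(G_m) + 1) // 2
    M = 2 * N - 1
    chi_p = 2 * np.pi * np.arange(M) / M
    return sum(interpolation_weight(chi - c, N) * rotate(G_m, c) for c in chi_p)

rng = np.random.default_rng(1)
N = 3
G_m = rng.normal(size=2 * N - 1) + 1j * rng.normal(size=2 * N - 1)  # hypothetical components
print(np.max(np.abs(steer(G_m, 0.7) - rotate(G_m, 0.7))))  # should be near zero
```

Since the same phases apply at every degree $l$, checking one vector of azimuthal components is enough to illustrate the relation for the full function.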

Typically, if $\hat G_{lm}$ has a nonzero value for at least one value of $l$ for all $m$ with
$|m|\le N-1$, then $T = 2N-1$ and the function is optimally steered by these $M = T$
angles and the function $k(\chi)$ described. On the contrary, when values of $m$, with
$|m|\le N-1$, exist for which $\hat G_{lm} = 0$ for all values of $l$, then $T < 2N-1$ and one
might want to reduce the number $M = 2N-1$ of basis functions. Depending on the
distribution of the $T$ values of $m$ for which $\hat G_{lm}$ has a nonzero value for at least one
value of $l$, the number of basis functions required to steer the band-limited function
may indeed be optimized to its smallest possible value $M = T$. This optimization
is notably reachable for functions with specific distributions of the $T$ values of $m$,
corresponding to particular symmetries in real space. For example, a function $G$ is
even or odd through rotation around itself by $\chi = \pi$ if and only if $\hat G_{lm}$ has nonzero
values only for, respectively, even or odd values of $m$. This property notably implies
that the central position of the function $G$ identifies with the north pole, in the sense
that its modulus $|G|$ is always even through rotation around itself by $\chi = \pi$. The
combination of an azimuthal band limit $N$ with that symmetry reads as

$$\hat G_{lm} = 0 \quad\text{for all } l,m \text{ with } m\notin T_N, \tag{4.36}$$

with

$$T_N = \{-(N-1),-(N-3),\ldots,(N-3),(N-1)\}. \tag{4.37}$$

In this particular case, $T = N$ and one may choose $M = N$ equally spaced rotation
angles $\chi_p\in[0,\pi)$ as $\chi_p = \pi p/N$, with $0\le p\le N-1$, and steer the function through
relation (4.35). The function $k(\chi)$ is defined by the Fourier coefficients $k_m = 1/N$
for $m\in T_N$, and $k_m = 0$ otherwise.
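The reduced rule with $M = N$ angles in $[0,\pi)$ can be sketched in the same spirit, again assuming that a rotation by $\chi$ multiplies the component of frequency $m$ by $e^{-im\chi}$. The function names are hypothetical and the components are random test data.

```python
import numpy as np

def weight_symmetric(chi, N):
    """k(chi) with Fourier coefficients 1 / N on T_N = {-(N-1), -(N-3), ..., N-1}."""
    T_N = np.arange(-(N - 1), N, 2)
    return np.sum(np.exp(1j * T_N * chi)).real / N

def steer_symmetric(G_m, chi, N):
    """Rotate components indexed by T_N using N basis rotations at chi_p = pi * p / N."""
    T_N = np.arange(-(N - 1), N, 2)
    chi_p = np.pi * np.arange(N) / N
    return sum(weight_symmetric(chi - c, N) * G_m * np.exp(-1j * T_N * c) for c in chi_p)

rng = np.random.default_rng(2)
N = 3
G_m = rng.normal(size=N) + 1j * rng.normal(size=N)  # components at m = -2, 0, 2
exact = G_m * np.exp(-1j * np.arange(-(N - 1), N, 2) * 0.9)
print(np.max(np.abs(steer_symmetric(G_m, 0.9, N) - exact)))  # should be near zero
```

For $N = 2$ the two weights evaluate to $\cos\chi$ and $\sin\chi$, which is the familiar steering rule of a first derivative.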

The lower the azimuthal band limit $N$ of the filter, the smaller the number of basis
functions $M$ required for its steerability. In particular, the axisymmetry of a function
may be understood as an extreme case of steerability, for an azimuthal band limit
$N = 1$, and a number of basis functions $M = 1$. By contrast, one may understand
a function as optimally directional if it is only sensitive to the specific direction $\chi$
in which it is rotated. Such a function would have an azimuthal dependence
containing nonzero coefficients for an infinite number of values of $m$, i.e.,
$N\to\infty$. An infinite number of weights would thus be required to steer such a
function. Optimal directionality and steerability are thus competing concepts.


4.3.1.3 Axisymmetric Case 

We consider the particular case of analysis of a real signal $F$ with a real and
axisymmetric analysis function $\Theta$, in a continuous wavelet formalism. The
directional correlation of $F$ with $\Theta$ is obviously independent of the rotation angle $\chi$. As
such, it reduces to a so-called standard correlation [236]:

$$W^F_\Theta(\omega_0,a) = \langle\Theta_{\omega_0,a}|F\rangle. \tag{4.38}$$

At each scale $a$, the wavelet coefficients identify a square-integrable function on $\mathbb{S}^2$
rather than on SO(3), which reads as an inverse spherical harmonic transform:

$$W^F_\Theta(\omega_0,a) = \sum_{l\in\mathbb{N}}\sum_{|m|\le l}\big(\widehat{W^F_\Theta}\big)_{lm}(a)\,Y_{lm}(\omega_0). \tag{4.39}$$

The spherical harmonic coefficients in this relation follow from relation (4.28) as
the pointwise product of the spherical harmonic coefficients of the signal and the
wavelet:

$$\big(\widehat{W^F_\Theta}\big)_{lm}(a) = \sqrt{\frac{4\pi}{2l+1}}\,\big(\widehat{\Theta_a}\big)^*_{l0}\,\hat F_{lm}. \tag{4.40}$$

The reconstruction of $F$ from its wavelet coefficients reads as

$$F(\omega) = \int_{\mathbb{R}^*_+}d\mu(a)\int_{\mathbb{S}^2}d\Omega_0\;W^F_\Theta(\omega_0,a)\,[R(\omega_0)L_\Theta\Theta_a](\omega), \tag{4.41}$$

for any scale integration measure $d\mu(a)$, and with the operator $L_\Theta$ in $L^2(\mathbb{S}^2,d\Omega)$
defined by $\widehat{(L_\Theta G)}_{l0} = \hat G_{l0}/C^l_\Theta$. The reconstruction formula holds if and only if the
analysis function satisfies the following admissibility condition for all $l\in\mathbb{N}$:

$$0 < C^l_\Theta = \frac{4\pi}{2l+1}\int_{\mathbb{R}^*_+}d\mu(a)\,\big|(\widehat{\Theta_a})_{l0}\big|^2 < \infty. \tag{4.42}$$


4.3.2 Stereographic Wavelets 


4.3.2.1 Correspondence Principle 


When the stereographic dilation is considered, the effect of the dilation on the
spherical harmonic coefficients of a function is not easily tractable analytically.
Consequently, the admissibility condition (4.30) is difficult to check in practice. It can
be shown that the nearly zero-mean condition

$$\frac{1}{4\pi}\int_{\mathbb{S}^2}d\Omega\,\frac{\Psi(\omega)}{1+\cos\theta} = 0 \tag{4.43}$$

is a necessary condition for wavelet admissibility. It is, however, formally not
sufficient. On the contrary, wavelets on the plane are well known, and may be
easily constructed, as the corresponding admissibility condition reduces to a
zero-mean condition for a function that is both integrable and square-integrable.
In that context, a correspondence principle was proved [235], stating that the
inverse stereographic projection of a wavelet on the plane leads to a wavelet on
the sphere.

The stereographic projection is the unique radial conformal diffeomorphism
mapping the sphere $\mathbb{S}^2$ onto the plane $\mathbb{R}^2$. The unitary stereographic projection
operator $\Pi$ between functions $G\in L^2(\mathbb{S}^2,d\Omega)$ and $g\in L^2(\mathbb{R}^2,d^2x)$, and its inverse,
respectively read as

$$[\Pi G](x) = \big(1+(r/2)^2\big)^{-1}\,G(\pi^{-1}x),$$
$$[\Pi^{-1}g](\omega) = \big(1+\tan^2(\theta/2)\big)\,g(\pi\omega), \tag{4.44}$$

where $\omega = (\theta,\varphi)$ still identify spherical coordinates on the sphere, and $x = (r,\varphi)$
identify polar coordinates in the plane tangent to the sphere at the north pole. The
azimuthal coordinates on the plane and on the sphere are identified with one
another: $\varphi$. The radial conformal diffeomorphism between points is given as
$\pi(\theta,\varphi) = (r(\theta),\varphi)$ for $r(\theta) = 2\tan(\theta/2)$, and its inverse reads $\pi^{-1}(r,\varphi) = (\theta(r),\varphi)$
for $\theta(r) = 2\arctan(r/2)$. The diffeomorphism $r(\theta)$ and its inverse $\theta(r)$ explicitly
define the stereographic projection and its inverse. This stereographic projection
maps the sphere, without its south pole, on the entire plane: $r(\theta): \theta\in[0,\pi)\to[0,\infty)$.
Geometrically, it projects a point $\omega = (\theta,\varphi)$ on the sphere onto a point $x = (r,\varphi)$
on the tangent plane at the north pole, colinear with $\omega$ and the south pole (see
Fig. 4.1). The prefactors in (4.44) are required to ensure the unitarity of the
projection operators $\Pi$ and $\Pi^{-1}$.
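The role of the prefactors in (4.44) can be checked numerically: with $r(\theta) = 2\tan(\theta/2)$, the inverse projection of a radial function on the plane should have the same $L^2$ norm on the sphere. The following sketch assumes a Gaussian test function, whose squared norm on the plane is $\pi$; the function names are hypothetical.

```python
import numpy as np

def r_of_theta(theta):
    """Radial coordinate on the tangent plane of the projected point."""
    return 2.0 * np.tan(theta / 2.0)

def theta_of_r(r):
    """Polar angle on the sphere of the inversely projected point."""
    return 2.0 * np.arctan(r / 2.0)

def inverse_projection(g):
    """Function on the sphere obtained from g(r, phi) on the plane, with its prefactor."""
    def G(theta, phi):
        return (1.0 + np.tan(theta / 2.0) ** 2) * g(r_of_theta(theta), phi)
    return G

def projection(G):
    """Function on the plane obtained from G(theta, phi) on the sphere, with its prefactor."""
    def g(r, phi):
        return G(theta_of_r(r), phi) / (1.0 + (r / 2.0) ** 2)
    return g

def sphere_norm_sq_axisym(G, n=200_000):
    """Squared L2 norm on the sphere of an axisymmetric function (midpoint rule in theta)."""
    dtheta = np.pi / n
    theta = (np.arange(n) + 0.5) * dtheta
    return 2.0 * np.pi * np.sum(np.abs(G(theta, 0.0)) ** 2 * np.sin(theta)) * dtheta

gauss = lambda r, phi: np.exp(-r**2 / 2.0)  # squared norm on the plane equals pi
G = inverse_projection(gauss)
print(sphere_norm_sq_axisym(G), np.pi)      # the two values should agree closely
```

Composing `projection` with `inverse_projection` returns the original function on the plane, since $1+\tan^2(\theta(r)/2) = 1+(r/2)^2$.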



Fig. 4.1: Stereographic projection $\pi$ and its inverse $\pi^{-1}$, relating points $(\theta,\varphi)$ on the sphere and
$(r,\varphi)$ on its tangent plane at the north pole. The same relation holds through $\Pi$ and $\Pi^{-1}$ between
functions living on each of the two manifolds, as illustrated by the shadow on the sphere and the
localized region on the plane. (Figure borrowed from [235].)

In this framework, the correspondence principle established states that if the
function $\psi\in L^2(\mathbb{R}^2,d^2x)$ satisfies the wavelet admissibility condition on the plane,
then the function

$$\Psi(\theta,\varphi) = [\Pi^{-1}\psi](\theta,\varphi), \tag{4.45}$$

in $L^2(\mathbb{S}^2,d\Omega)$, satisfies the wavelet admissibility condition (4.30) on the sphere.
Notice that this correspondence principle requires the definition of a scale
integration measure identical to the measure used on the plane: $d\mu(a) = a^{-3}da$.
This enables the construction of wavelets on the sphere by projection of wavelets
on the plane. It also transfers wavelet properties from the plane onto the sphere. In
particular, as the steerability of functions only depends on their behavior relative to
the azimuthal variable $\varphi$, it is obviously preserved through stereographic projection,
which only affects the function through its dependence on the polar angle $\theta$ on the
sphere or on the radial variable $r$ on the plane.


4.3.2.2 Example Filters 


For the sake of illustration, here we present axisymmetric, directional, and steerable 
example wavelets on the sphere in the context of the wavelet formalism with stere¬ 
ographic dilation. These wavelets are thus built as inverse stereographic projections 
of wavelets on the plane. 

The axisymmetric Mexican hat wavelet on the plane is defined as the normalized
(negative) Laplacian of a Gaussian $e^{-(x^2+y^2)/2}$. Its inverse stereographic projection
defines the axisymmetric Mexican hat wavelet on the sphere (see Fig. 4.2). The
elliptical Mexican hat wavelet is a directional modification of the axisymmetric Mexican
hat, obtained by considering different widths $\sigma_x$ and $\sigma_y$, respectively, in the $x$ and
$y$ directions on the plane for the original Gaussian [187]. The wavelet obtained as
the inverse stereographic projection of the (negative) Laplacian of this Gaussian is
proportional to (see Fig. 4.2)

$$\Psi(\theta,\varphi)\propto\Big(1+\tan^2\frac{\theta}{2}\Big)\left[1-\frac{4\tan^2(\theta/2)}{\sigma_x^2+\sigma_y^2}\left(\frac{\sigma_y^2}{\sigma_x^2}\cos^2\varphi+\frac{\sigma_x^2}{\sigma_y^2}\sin^2\varphi\right)\right]
\times e^{-2\tan^2(\theta/2)\,(\cos^2\varphi/\sigma_x^2+\sin^2\varphi/\sigma_y^2)}. \tag{4.46}$$

One can identify the wavelet parameters through the eccentricity of the ellipse
defined by the points where the wavelet has zero value (zero-crossing),
$\epsilon = (1-(\sigma_x/\sigma_y)^4)^{1/2}$ (for $\sigma_y\ge\sigma_x$), and the sum $s = \sigma_x^2+\sigma_y^2$. It is alternatively
described by the ratio of the semimajor and semiminor axes of the Gaussian
$r = \sigma_x/\sigma_y$, and the sum $s = \sigma_x^2+\sigma_y^2$. The axisymmetric Mexican hat is recovered
for $\sigma_x = \sigma_y = 1$, in which case $r = 1$ ($\epsilon = 0$), and $s = 2$.
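A direct evaluation of the elliptical Mexican hat helps relate the parameters $(r, s)$ to the widths. The sketch below assumes the unnormalized form obtained by projecting the negative Laplacian of a Gaussian with widths `sigma_x` and `sigma_y`; the function names are hypothetical and no normalization constant is included.

```python
import numpy as np

def elliptical_mexican_hat(theta, phi, sigma_x=1.0, sigma_y=1.0):
    """Unnormalized elliptical Mexican hat on the sphere, in the form of (4.46)."""
    t2 = np.tan(theta / 2.0) ** 2
    s = sigma_x**2 + sigma_y**2
    c2, s2 = np.cos(phi) ** 2, np.sin(phi) ** 2
    bracket = 1.0 - (4.0 * t2 / s) * ((sigma_y / sigma_x) ** 2 * c2
                                      + (sigma_x / sigma_y) ** 2 * s2)
    return (1.0 + t2) * bracket * np.exp(-2.0 * t2 * (c2 / sigma_x**2 + s2 / sigma_y**2))

def eccentricity(sigma_x, sigma_y):
    """Eccentricity of the zero-crossing ellipse, for sigma_y >= sigma_x."""
    return np.sqrt(1.0 - (sigma_x / sigma_y) ** 4)

def widths_from(r, s):
    """Widths (sigma_x, sigma_y) such that sigma_x / sigma_y = r and their squares sum to s."""
    sigma_y = np.sqrt(s / (1.0 + r**2))
    return r * sigma_y, sigma_y

sx, sy = widths_from(0.5, 2.0)
print(eccentricity(sx, sy))                    # eccentricity for r = 0.5
theta0 = 2.0 * np.arctan(np.sqrt(0.5))         # zero crossing of the axisymmetric case
print(elliptical_mexican_hat(theta0, 1.3))     # should vanish for sigma_x = sigma_y = 1
```

The eccentricity depends only on the ratio $r$, so it can be computed without fixing $s$.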

The real Morlet wavelet on the plane is another typical example of a directional
wavelet. Its inverse stereographic projection (see also [64, 187] for similar
projections) on the sphere is proportional to (see Fig. 4.3)

$$\Psi(\theta,\varphi)\propto\Big(1+\tan^2\frac{\theta}{2}\Big)\cos\Big(\frac{\mathbf{k}\cdot\mathbf{x}}{\sqrt{2}}\Big)\,e^{-2\tan^2(\theta/2)}, \tag{4.47}$$

with $\mathbf{x} = (2\tan(\theta/2)\cos\varphi,\,2\tan(\theta/2)\sin\varphi)$ in Cartesian coordinates. The
arbitrary wave-vector $\mathbf{k} = (k_x,k_y)$ controls the direction and frequency of
oscillation of the wavelet ($|\mathbf{k}|^2 = k_x^2+k_y^2$). Notice that for $|\mathbf{k}| = 2$, the real Morlet wavelet
closely approximates at large scales the second Gaussian derivative described in the
following.

Fig. 4.2: Mexican hat wavelet on the sphere for a dilation factor $a = 0.4$ and different eccentricities.
On the left, the axisymmetric Mexican hat: $r = 1$ ($\epsilon = 0$) and $s = 2$. At the center and on the
right, respectively, the elliptical Mexican hat for $r = 0.5$ ($\epsilon\approx 0.96825$) and $s = 2$, and $r = 0.1$
($\epsilon\approx 0.99995$) and $s = 2$. Dark and light regions respectively identify negative and positive values.
(Figure borrowed from [237].)



Fig. 4.3 : Real Morlet wavelet on the sphere for a dilation factor a = 0.4 and a wave-vector k = (6,0) 
on the left, and for a dilation factor a = 0.4 and a wave-vector k = (2,0) on the right. Dark and 
light regions respectively identify negative and positive values. (Figure borrowed from [237].) 


Derivatives of order $N-1$ in direction $\hat x$ of radial functions on the plane are
steerable wavelets. Their inverse stereographic projection thus defines steerable
wavelets on the sphere. They have an azimuthal band limit equal to $N$ and may be
rotated in terms of $M = N$ basis filters. We give explicit examples of the normalized
first and second Gaussian derivatives. A first derivative has a band limit $N = 2$ and
only contains the frequencies $m = \{\pm 1\}$. It may be rotated in terms of two specific
rotations at $\chi = 0$ and $\chi = \pi/2$, corresponding to the inverse projection of the first
derivatives in directions $\hat x$ and $\hat y$, $\Psi^{\partial_{\hat x}}$ and $\Psi^{\partial_{\hat y}}$ respectively:

$$\Psi^{\partial_{\hat x}}_\chi(\omega) = \Psi^{\partial_{\hat x}}(\omega)\cos\chi+\Psi^{\partial_{\hat y}}(\omega)\sin\chi. \tag{4.48}$$

The normalized first derivatives of a Gaussian (see Fig. 4.4) in directions $\hat x$ and $\hat y$
read

$$\Psi^{\partial_{\hat x}}(\theta,\varphi) = \sqrt{\frac{8}{\pi}}\Big(1+\tan^2\frac{\theta}{2}\Big)\tan\frac{\theta}{2}\cos\varphi\;e^{-2\tan^2(\theta/2)},$$
$$\Psi^{\partial_{\hat y}}(\theta,\varphi) = \sqrt{\frac{8}{\pi}}\Big(1+\tan^2\frac{\theta}{2}\Big)\tan\frac{\theta}{2}\sin\varphi\;e^{-2\tan^2(\theta/2)}. \tag{4.49}$$

Fig. 4.4: First Gaussian derivative wavelet on the sphere for a dilation factor $a = 0.4$: from left
to right, $\Psi^{\partial_{\hat x}}$, $\Psi^{\partial_{\hat y}}$, and rotation by $\chi = \pi/4$ of $\Psi^{\partial_{\hat x}}$. Dark and light regions respectively identify
negative and positive values. (Figure borrowed from [235].)


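A second derivative has a band limit $N = 3$ and contains the frequencies
$m = \{0,\pm 2\}$. It may be rotated in terms of three basis filters. It indeed reads in
terms of the inverse projection of the second derivatives in directions $\hat x$ and $\hat y$,
$\Psi^{\partial^2_{\hat x}}$ and $\Psi^{\partial^2_{\hat y}}$, respectively, and the cross derivative $\Psi^{\partial_{\hat x}\partial_{\hat y}}$ as

$$\Psi^{\partial^2_{\hat x}}_\chi(\omega) = \Psi^{\partial^2_{\hat x}}(\omega)\cos^2\chi+\Psi^{\partial^2_{\hat y}}(\omega)\sin^2\chi+\Psi^{\partial_{\hat x}\partial_{\hat y}}(\omega)\sin 2\chi. \tag{4.50}$$

The correctly normalized second derivatives of a Gaussian (see Fig. 4.5) in
directions $\hat x$ and $\hat y$ read

$$\Psi^{\partial^2_{\hat x}}(\theta,\varphi) = \sqrt{\frac{4}{3\pi}}\Big(1+\tan^2\frac{\theta}{2}\Big)\Big(1-4\tan^2\frac{\theta}{2}\cos^2\varphi\Big)e^{-2\tan^2(\theta/2)},$$
$$\Psi^{\partial^2_{\hat y}}(\theta,\varphi) = \sqrt{\frac{4}{3\pi}}\Big(1+\tan^2\frac{\theta}{2}\Big)\Big(1-4\tan^2\frac{\theta}{2}\sin^2\varphi\Big)e^{-2\tan^2(\theta/2)},$$
$$\Psi^{\partial_{\hat x}\partial_{\hat y}}(\theta,\varphi) = -\frac{4}{\sqrt{3\pi}}\Big(1+\tan^2\frac{\theta}{2}\Big)\tan^2\frac{\theta}{2}\sin 2\varphi\;e^{-2\tan^2(\theta/2)}. \tag{4.51}$$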
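The steering relations (4.48) and (4.50) can be tested numerically on explicit expressions for the projected Gaussian derivatives. The sketch below assumes the following forms, derived here from the unitarity of the projection and from the requirement that (4.50) hold without extra factors: prefactors $\sqrt{8/\pi}$ for the first derivatives, $\sqrt{4/(3\pi)}$ for the second derivatives, and $-4/\sqrt{3\pi}$ for the cross derivative. It also assumes that a rotation by $\chi$ maps $\varphi$ to $\varphi-\chi$. All function names are hypothetical.

```python
import numpy as np

def _radial(theta):
    """Return tan(theta / 2) and the common factor (1 + tan^2) * exp(-2 tan^2)."""
    t = np.tan(theta / 2.0)
    return t, (1.0 + t**2) * np.exp(-2.0 * t**2)

def d_x(theta, phi):
    t, c = _radial(theta)
    return np.sqrt(8.0 / np.pi) * c * t * np.cos(phi)

def d_y(theta, phi):
    t, c = _radial(theta)
    return np.sqrt(8.0 / np.pi) * c * t * np.sin(phi)

def d_xx(theta, phi):
    t, c = _radial(theta)
    return np.sqrt(4.0 / (3.0 * np.pi)) * c * (1.0 - 4.0 * t**2 * np.cos(phi) ** 2)

def d_yy(theta, phi):
    t, c = _radial(theta)
    return np.sqrt(4.0 / (3.0 * np.pi)) * c * (1.0 - 4.0 * t**2 * np.sin(phi) ** 2)

def d_xy(theta, phi):
    t, c = _radial(theta)
    return -4.0 / np.sqrt(3.0 * np.pi) * c * t**2 * np.sin(2.0 * phi)

def norm_sq(psi, n_theta=4000, n_phi=32):
    """Squared L2 norm on the sphere (midpoint rule in theta, uniform grid in phi)."""
    dtheta, dphi = np.pi / n_theta, 2.0 * np.pi / n_phi
    theta = (np.arange(n_theta) + 0.5) * dtheta
    phi = np.arange(n_phi) * dphi
    T, P = np.meshgrid(theta, phi, indexing="ij")
    return np.sum(psi(T, P) ** 2 * np.sin(T)) * dtheta * dphi

theta, phi, chi = 0.8, 1.1, 0.6
first = d_x(theta, phi) * np.cos(chi) + d_y(theta, phi) * np.sin(chi)
second = (d_xx(theta, phi) * np.cos(chi) ** 2 + d_yy(theta, phi) * np.sin(chi) ** 2
          + d_xy(theta, phi) * np.sin(2.0 * chi))
print(first - d_x(theta, phi - chi), second - d_xx(theta, phi - chi))  # both near zero
```

Under these assumptions `norm_sq(d_x)` and `norm_sq(d_xx)` evaluate to values close to one, whereas the cross derivative carries the normalization needed for the steering relation rather than unit norm.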


4.3.3 Kernel Wavelets

4.3.3.1 Harmonic Dilation Case 

When the harmonic dilation is considered, the analysis function $\Psi$ must satisfy the
following form of the admissibility condition (4.30). As a first constraint, one has
$\hat\Psi_{00} = \tilde\Psi_0(0) = 0$, which corresponds to the requirement that $\Psi$ has a zero mean on
the sphere:

$$\frac{1}{4\pi}\int_{\mathbb{S}^2}d\Omega\,\Psi(\omega) = 0. \tag{4.52}$$

Fig. 4.5: Second Gaussian derivative wavelet on the sphere for a dilation factor $a = 0.4$: from left to
right, $\Psi^{\partial^2_{\hat x}}$, $\Psi^{\partial^2_{\hat y}}$, $\Psi^{\partial_{\hat x}\partial_{\hat y}}$, and below, rotation by $\chi = \pi/4$ of $\Psi^{\partial^2_{\hat x}}$. Dark and light regions respectively
identify negative and positive values. (Figure borrowed from [235].)


This zero mean is of course preserved through harmonic dilation. As the zero
frequency is not supported by the wavelets, only signals with zero mean can be
analyzed in this formalism [see relation (4.28)]. Notice that the scale integration
measure can arbitrarily be chosen as $d\mu(a) = a^{-1}\,da$. This leads to a simple
expression of the remaining constraints for $l \in \mathbb{N}^0$ as

$$0 < C_\Psi^l = \frac{8\pi^2}{2l+1}\sum_{|m|\le l}\int_{\mathbb{R}_+}\frac{dk'}{k'}\,|\tilde\Psi_m(k')|^2 < \infty. \tag{4.53}$$

The left-hand-side inequality implies $0 < \int_{\mathbb{R}_+} dk'/k'\,|\tilde\Psi_{m_0}(k')|^2$ for at least one of the
first two generating functions: $m_0 \in \{0,1\}$. In other words, either $\tilde\Psi_0$ or $\tilde\Psi_1$ must be
nonzero on a set of nonzero measure on $\mathbb{R}_+$. The right-hand-side inequality implies
$\int_{\mathbb{R}_+} dk'/k'\,|\tilde\Psi_m(k')|^2 < \infty$ for all generating functions: $m \in \mathbb{Z}$. Hence, the generating
functions must satisfy $\tilde\Psi_m(0) = 0$ [this condition encompasses the zero-mean
condition (4.52) in the form $\tilde\Psi_0(0) = 0$] and tend to zero when $k' \to \infty$. With this
choice of scale integration measure, the constraints reduce to the requirement
that each generating function satisfies a condition very similar to the wavelet
admissibility condition [235, 4] for an axisymmetric wavelet on the plane defined by a
Fourier transform identical to $\tilde\Psi_m(k)$. Consequently, the wavelet admissibility
condition (4.53) can be checked in practice and wavelets associated with the harmonic
dilation can be designed easily.

For continuous axisymmetric wavelets, a unique generating function $\tilde\Theta_0(k)$ of a
continuous variable $k \in \mathbb{R}_+$ is required. The admissibility condition (4.42) reduces
to the following expression. The analysis function $\Theta$ must have a zero mean and
only allows the analysis of signals with zero mean. A unique additional condition
holds independently of $l$:

$$0 < C_\Theta = \int_{\mathbb{R}_+}\frac{dk'}{k'}\,|\tilde\Theta_0(k')|^2 < \infty. \tag{4.54}$$

This condition actually encompasses the zero-mean condition in the form $\tilde\Theta_0(0) = 0$
and also requires that the generating function tend to zero when $k' \to \infty$. The
coefficients entering the reconstruction formula (4.41) read as $C_\Theta^l = 4\pi C_\Theta/(2l+1)$,
for $l \in \mathbb{N}^0$.
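
As a numerical illustration of an admissibility integral of the form (4.54), the following minimal Python sketch approximates $\int dk'/k'\,|g(k')|^2$ for two generating functions. Both functions, and the names `admissibility_integral`, `vanishing_at_zero`, and `gaussian`, are assumptions chosen for illustration; they are not taken from the text.

```python
import math

def admissibility_integral(gen, k_min, k_max, n=20_000):
    """Approximate the integral of |gen(k)|^2 / k over [k_min, k_max].

    Midpoint rule in the variable u = ln k, for which dk / k = du.
    """
    u_lo, u_hi = math.log(k_min), math.log(k_max)
    du = (u_hi - u_lo) / n
    total = 0.0
    for i in range(n):
        k = math.exp(u_lo + (i + 0.5) * du)
        total += abs(gen(k)) ** 2
    return total * du

# Hypothetical generating functions (illustration only).
def vanishing_at_zero(k):
    return k * k * math.exp(-0.5 * k * k)   # gen(0) = 0: candidate admissible

def gaussian(k):
    return math.exp(-0.5 * k * k)           # gen(0) = 1: zero-mean condition violated

c_ok = admissibility_integral(vanishing_at_zero, 1e-6, 20.0)
c_bad_3 = admissibility_integral(gaussian, 1e-3, 20.0)
c_bad_6 = admissibility_integral(gaussian, 1e-6, 20.0)
```

For the first function the integral converges (its exact value is 1/2), whereas for the Gaussian the result keeps growing like $-\ln k_{\min}$ as the lower cutoff is reduced, which is the logarithmic divergence that the condition $g(0) = 0$ rules out.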

However, as discussed in Section 4.2.3, the evolution of the localization and
directionality properties of functions in real space through harmonic dilation is
not explicitly controlled. These requirements are met in the context of the kernel
dilation. Consequently, in the following we describe the continuous wavelet
formalism based on this kernel dilation. Moreover, this dilation renders the transition
between the continuous and scale-discretized formalisms much simpler and more
transparent than the harmonic dilation does.


4.3.3.2 Kernel Dilation Case 

When the kernel dilation is considered, factorized steerable functions $\Psi \in L^2(S^2, d\Omega)$
with compact harmonic support must be used:

$$\Psi_{lm} = K_\Psi(l)\,S^\Psi_{lm}, \tag{4.55}$$

for a continuous kernel defined by a positive real function $K_\Psi(k) \in \mathbb{R}_+$ and a
directional split defined by the directionality coefficients $S^\Psi_{lm}$. The compact
harmonic support of the wavelet in the interval $l \in (\lfloor\alpha^{-1}B\rfloor, B)$ is ensured by a
kernel $K_\Psi(k)$ with compact support in the interval $k \in (\alpha^{-1}B, B)$, with a
compactness $c(\alpha) = \alpha/(\alpha - 1) \in [1, \infty)$:

$$K_\Psi(k) = 0 \quad\text{for } k \notin (\alpha^{-1}B, B). \tag{4.56}$$

The steerability of a wavelet with an azimuthal band limit $N$ is ensured by the
directional split:

$$S^\Psi_{lm} = 0 \quad\text{for all } l, m \text{ with } |m| \ge N, \tag{4.57}$$

with

$$\sum_{|m| \le \min(N-1,\,l)} |S^\Psi_{lm}|^2 = 1 \tag{4.58}$$

for all $l \in \mathbb{N}^0$. Continuous axisymmetric wavelets $\Theta(\theta)$ with compact harmonic
support are simply obtained by the trivial directional split with $S^\Theta_{lm} = \delta_{m0}$ for all
$l \in \mathbb{N}^0$.
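
The constraints (4.57) and (4.58) on the directional split can be sketched in a few lines of Python. The uniform weighting over the allowed values of $m$ is only one hypothetical choice made here for illustration, and the names `uniform_directional_split` and `factorized_coefficients` are not from the text; the second function assumes that the kernel dilation by $a$ evaluates the kernel at $al$.

```python
import math

def uniform_directional_split(l_max, n_azimuthal):
    """Return {(l, m): S_lm} with S_lm = 0 for |m| >= N and unit norm for each l >= 1.

    Hypothetical choice: equal weight on every allowed m. S_00 is left out (set to 0).
    """
    split = {}
    for l in range(1, l_max + 1):
        m_max = min(n_azimuthal - 1, l)
        weight = 1.0 / math.sqrt(2 * m_max + 1)
        for m in range(-m_max, m_max + 1):
            split[(l, m)] = weight
    return split

def factorized_coefficients(kernel, split, a=1.0):
    """Harmonic coefficients of the factorized wavelet: kernel(a * l) * S_lm."""
    return {(l, m): kernel(a * l) * s for (l, m), s in split.items()}

split = uniform_directional_split(l_max=8, n_azimuthal=3)
norms = {l: sum(s * s for (ll, m), s in split.items() if ll == l) for l in range(1, 9)}
```

With $N = 1$ the same construction returns the trivial split $\delta_{m0}$ of the axisymmetric case.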


Pierre Vandergheynst and Yves Wiaux 


The analysis of a signal $F \in L^2(S^2, d\Omega)$ with the analysis function $\Psi$ gives the
wavelet coefficients $W_\Psi^F(\rho, a)$ at each continuous scale $a$, around each point $\omega_0$, and
in each orientation $\chi$, through the directional correlation (4.1). The reconstruction of
$F$ from its wavelet coefficients results from relation (4.29). The zero-mean condition
(4.52) for the admissibility of $\Psi$ implies $K_\Psi(0) = 0$. One can also arbitrarily set
$S^\Psi_{00} = 0$. The admissibility condition (4.53) reduces to

$$0 < C_\Psi = \int_{\mathbb{R}_+}\frac{dk'}{k'}\,K_\Psi^2(k') < \infty, \tag{4.59}$$

which actually also encompasses the zero-mean condition. The coefficients entering
the reconstruction formula are $C_\Psi^l = 8\pi^2 C_\Psi/(2l+1)$ for $l \in \mathbb{N}^0$. In other words, the
kernel must formally be identified with the Fourier transform of an axisymmetric
wavelet on the plane.

Notice that for a factorized wavelet $\Psi$, the directional correlation defining the
analysis of a signal may also be understood as a double correlation, by the kernel
and the directional split successively. The standard correlation (4.38) of the signal
$F$ and the axisymmetric wavelets defined by the kernel of $\Psi$ provides intermediate
wavelet coefficients $W^F_{K_\Psi}(\omega_0, a)$ on $S^2$ at each scale $a \in \mathbb{R}_+$. The spherical harmonic
transform of these coefficients reads as



(4.60) 


At each scale $a$, the directional correlation of the intermediate signal $W^F_{K_\Psi}(\omega_0, a)$ and
a directional wavelet defined by the directional split of $\Psi$ provides the final wavelet
coefficients on SO(3):



(4.61) 


This reasoning obviously holds independently of the steerability or compact
harmonic support properties of $\Psi$.

Let us finally emphasize that even though the steerability of the wavelet is ini¬ 
tially set by the directional split, this steerability may be affected through kernel 
dilation due to the modified compact harmonic support associated with the dilated 
kernel. However, the computational advantage in relation (4.32) introduced by the 
existence of an azimuthal band limit in the definition of the directional split is pre¬ 
served through kernel dilation. 


4.3.4 Discretization of Variables 


The directional correlation defining wavelet coefficients at each position,
orientation, and scale in (4.1) requires integration of functions on a continuous
variable on $S^2$. The reconstruction of the signal at each point on $S^2$ from its wavelet
coefficients in (4.29) also requires integration on a continuous variable on SO(3), as
well as on the dilation parameter $a \in \mathbb{R}_+^*$. Practical implementations must obviously
be based on a choice of discretization for each of these variables.

On the one hand, adequate pixelizations of the parameter spaces of the sphere
and the rotation group may be designed for approximate or even exact integration on
$\omega \in S^2$ or $\rho \in SO(3)$ by finite weighted summations, generically called quadratures.
Quadrature rules obviously rely on the fact that the spaces of integration are
compact. On the other hand, continuous scale integration on $a \in \mathbb{R}_+^*$ may not be suitably
approximated by quadrature rules. As a consequence, in the framework of a wavelet
formalism relying on a continuous dilation parameter, signals may be analyzed at
specifically chosen scales by computing the corresponding wavelet coefficients,
but reconstruction is not accessible in practice. In the next section, we describe fast
and potentially exact algorithms for the analysis of a signal with a wavelet in that
context. The definition of a wavelet formalism in which the dilation operation relies
on a discrete dilation parameter is a necessary condition for achieving signal
reconstruction from wavelet coefficients in practice. Applications such as denoising or
deconvolution with wavelets obviously require a discrete formalism where the
signal under scrutiny may in practice be reconstructed after modification of its wavelet
coefficients. Such a discrete wavelet formalism is discussed later in this
chapter, along with a corresponding fast and exact algorithm for both analysis and
reconstruction.


4.4 Analysis Algorithms 

In this section, we discuss choices of pixelizations on the sphere and on the rotation 
group, and describe two fast algorithms for the analysis of signals in the context of 
the continuous wavelet formalism developed. 


4.4.1 Pixelization 

4.4.1.1 Sampling Theorems 

We generally consider band-limited signals. Any function $G \in L^2(S^2, d\Omega)$ is said
to be band-limited with band limit $B$, for any $B \in \mathbb{N}^0$, if $G_{lm} = 0$ for all $l, m$ with
$l \ge B$. Any function $H \in L^2(SO(3), d\rho)$ is said to be band-limited with band limit
$B$, for any $B \in \mathbb{N}^0$, if $H^l_{mn} = 0$ for all $l, m, n$ with $l \ge B$. From relation (4.28), if the
signal $F$ or the wavelet $\Psi$ is band-limited on $S^2$, then the wavelet coefficients $W_\Psi^F$
are automatically band-limited on SO(3), with the same band limit $B$. A continuous
band-limited signal $F \in L^2(S^2, d\Omega)$ and a continuous wavelet $\Psi \in L^2(S^2, d\Omega)$
are respectively identified by the $\mathcal{O}(B^2)$ spherical harmonic coefficients $F_{lm}$ and
$\Psi_{lm}$, with $l \in \mathbb{N}$, $m \in \mathbb{Z}$, and $|m| \le l$. A simple extrapolation of the Nyquist-Shannon
theorem on the line would suggest that the same number $\mathcal{O}(B^2)$ of sampled
values $F(\omega_i)$ on points $\omega_i \in S^2$ is required in order to describe the signal
completely. The index $i$ simply identifies the points of the sampling. Typically, $\mathcal{O}(B)$
values are required for both samplings in $\theta$ and $\varphi$. The directional correlation
associated with the wavelet coefficients at each scale, $W_\Psi^F(\rho, a) \in L^2(SO(3), d\rho)$, is
identified by $\mathcal{O}(B^3)$ Wigner D-function coefficients $(W_\Psi^F)^l_{mn}(a)$ with $l \in \mathbb{N}$, $m, n \in \mathbb{Z}$,
and $|m|, |n| \le l$. A similar Nyquist-Shannon extrapolation would also suggest that
a number $\mathcal{O}(B^3)$ of sampled values $W_\Psi^F(\rho_i, a)$ on points $\rho_i \in SO(3)$ is required
for the exact description of each directional correlation. Again, $\mathcal{O}(B)$ values are
required for samplings in $\theta_0$, $\varphi_0$, and $\chi$.

These considerations raise the question of the choice of pixelization of $S^2$ on
which the original signal should be sampled in order to provide precise computation
of the required directional correlation at each scale. The same question is raised
for the choice of pixelization of SO(3) on which the wavelet coefficients should be
computed so that the continuous counterpart of the sampled values is known.

On the sphere $S^2$, $2B \times 2B$ equiangular pixelizations are defined on points
$\omega_{ij} = (\theta_i, \varphi_j)$ for $0 \le i, j \le 2B - 1$, with a uniform discretization of the
coordinates: $\Delta\theta = \theta_{i+1} - \theta_i = \pi/2B$ and $\Delta\varphi = \varphi_{j+1} - \varphi_j = 2\pi/2B$. The specific choice
$\theta_0 = \pi/4B$ and $\varphi_0 = 0$ can be made for convenience. It gives $\theta_i = (2i+1)\pi/4B$
and $\varphi_j = 2j\pi/2B$, and excludes the poles from the sampling, which can be convenient
for numerical reasons. The pixel centers are identified with the sampling points $\omega_{ij}$
defined above. The pixel edges are identified by parallels shifted by $\Delta\theta/2 =
\pi/4B$, and meridians shifted by $\Delta\varphi/2 = 2\pi/4B$, relative to $\omega_{ij}$. The poles therefore
appear as pixel corners. Equiangular pixelizations enjoy the so-called iso-latitude
property; i.e., the sampling in $\theta$ is independent of $\varphi$, which is of interest for the
computational purposes discussed below. Also, notice that the pixel area varies drastically with $\theta$
as $\sin\theta\,d\theta\,d\varphi$.
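
The equiangular grid described above can be generated directly from the formulas for $\theta_i$ and $\varphi_j$. The following Python sketch is a minimal illustration; the function names `equiangular_grid` and `pixel_areas` are assumptions, and the pixel area is obtained by integrating $\sin\theta\,d\theta\,d\varphi$ between the pixel edges.

```python
import math

def equiangular_grid(B):
    """Sampling angles of the 2B x 2B equiangular pixelization (poles excluded)."""
    thetas = [(2 * i + 1) * math.pi / (4 * B) for i in range(2 * B)]
    phis = [2 * math.pi * j / (2 * B) for j in range(2 * B)]
    return thetas, phis

def pixel_areas(B):
    """Exact area of one pixel in each iso-latitude ring (identical for every phi)."""
    d_theta = math.pi / (2 * B)
    d_phi = 2 * math.pi / (2 * B)
    # Ring i lies between the parallels theta = i * d_theta and (i + 1) * d_theta.
    return [d_phi * (math.cos(i * d_theta) - math.cos((i + 1) * d_theta))
            for i in range(2 * B)]

thetas, phis = equiangular_grid(4)
areas = pixel_areas(4)
total_area = 2 * 4 * sum(areas)   # 2B pixels per ring
```

The areas sum to $4\pi$, and the rings next to the poles are much smaller than those near the equator, which is the strong variation of pixel area with $\theta$ noted above.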

A sampling theorem ensures that exact quadrature rules for integration of signals
with band limit $B$ on $S^2$ exist on $2B \times 2B$ equiangular grids [76]. This sampling
theorem represents a generalization of the Nyquist-Shannon theorem on the line.
One way of stating it is to say that the spherical harmonic coefficients of a
band-limited function on the sphere may be computed exactly up to a band limit $B$,
through an equiangular sampling, as a finite weighted sum, i.e., a quadrature, of
the sampled values of that function [76]. The weights are defined from the structure
of the Legendre polynomials $P_l(\cos\theta)$ on the interval $[0, \pi]$.

Other pixelization schemes widely used in astrophysics and cosmology may be
considered. The HEALPix pixelization¹ (Hierarchical Equal Area iso-Latitude
Pixelization) [105], and the GLESP pixelization² (Gauss-Legendre Sky Pixelization)
[75, 74] are two major examples. GLESP pixelizations are defined by a sampling
of $\theta$ on the roots of the Legendre polynomials of some order related to $B$, and by
an equiangular sampling on $\varphi$ for each value of $\theta$. This scheme provides pixels of
nearly equal areas. A sampling theorem also ensures exact quadrature rules on such
pixelizations. HEALPix pixelizations are defined through a hierarchical pixelization
scheme, containing $12N_{\rm side}^2$ pixels of exactly equal areas at resolution $N_{\rm side} = 2^k$
with $k \in \mathbb{N}$. Only approximate quadrature rules exist, which nevertheless can be
made very precise thanks to an iteration process.

¹ http://healpix.jpl.nasa.gov/

² http://www.glesp.nbi.dk/

On the rotation group SO(3), pixelizations may, for instance, be defined by
combining pixelizations on $S^2$ with an equiangular sampling of $\chi$. A sampling theorem
exists on the pixelizations based on equiangular and Gauss-Legendre pixelizations
on $S^2$, hence providing exact quadrature rules for the integration of band-limited
signals on SO(3). Again, quadrature rules for pixelizations based on HEALPix
pixelizations are approximate. This extension basically relies on the separation of the
integration variables [180, 179, 152] from relation (4.9). The sampling theorem
notably ensures that the Wigner D-function coefficients of a band-limited function
on SO(3) may be computed exactly by quadrature up to a band limit $B$.
A corresponding choice of pixelization for computation of the wavelet coefficients
at each scale $a$ will thus provide exact knowledge of the corresponding
function in $L^2(SO(3), d\rho)$.


4.4.1.2 A Priori Computational Complexity 

The computational complexity of an algorithm, or of a part of it, may be defined
as the number of basic summation or multiplication operations required to obtain
the result from initial data. Each two-dimensional scalar product on $S^2$ required in
the directional correlation relation (4.1) may a priori be computed by quadrature
with computational complexity $\mathcal{O}(B^2)$ on $\mathcal{O}(B^2)$ positions and $\mathcal{O}(B)$ orientations,
at each analysis scale $a$. The computation would be exact on those pixelizations
where a sampling theorem holds. However, such an $\mathcal{O}(B^5)$ complexity is
unaffordable for fine samplings on the sphere with $B > 10^3$. The definition
of fast analysis algorithms is consequently essential.


4.4.2 Fast Algorithms 

4.4.2.1 Separation of Variables 

Let us consider a function $G \in L^2(S^2, d\Omega)$ with band limit $B$ and given in terms of
its sampled values $G(\omega_i)$ on the $\mathcal{O}(B^2)$ discrete points $\omega_i$ of the chosen
pixelization of $S^2$. The factorized form (4.2) of the spherical harmonics naturally enables
one to compute a direct spherical harmonic transform by quadrature through
separation of the integrations on the variables $\theta$ and $\varphi$. Conversely, an inverse spherical
harmonic transform may be computed as successive summations on the indices $l$
and $m$, up to the band limit $B$. Correctly ordering the corresponding operations
provides a calculation of direct and inverse spherical harmonic transforms in $\mathcal{O}(B^3)$
operations [76]. This separation of variables in the spherical harmonic transforms
simply requires iso-latitude pixelizations on the sphere. The computation of the
spherical harmonic coefficients of a band-limited function is theoretically exact only
on those pixelizations where a sampling theorem holds. The inverse spherical
harmonic transform is of course exact independently of the pixelization as it is a simple
finite sum truncated by the band limit.

Let us consider a function $H \in L^2(SO(3), d\rho)$ with band limit $B$ and given in
terms of its sampled values $H(\rho_i)$ on the $\mathcal{O}(B^3)$ discrete points $\rho_i$ of the chosen
pixelization of SO(3). The factorized form (4.9) of the Wigner D-functions enables
the computation of a direct Wigner D-function transform by quadrature through
separation of the integrations on the variables $\theta_0$, $\varphi_0$, and $\chi$. Conversely, an inverse
Wigner D-function transform may be computed as successive summations on the
indices $l$, $m$, and $n$, up to the band limit $B$. Correctly ordering the corresponding
operations provides a calculation of direct and inverse Wigner D-function transforms
in $\mathcal{O}(B^4)$ operations [180, 179]. Again, iso-latitude pixelizations are required for
$\theta_0$ and $\varphi_0$ on the sphere, while the separation of variables may be performed for any
structure of the sampling in the third Euler angle $\chi$, potentially depending on $\theta_0$
and $\varphi_0$. The computation of the Wigner D-function coefficients of a band-limited
function is theoretically exact only on those pixelizations where a sampling theorem
holds. The inverse Wigner D-function transform is of course exact independently of
the pixelization as it is a simple finite sum truncated by the band limit.

In this context, relation (4.28) provides a simple expression for the Wigner
D-function coefficients of the directional correlation defining the wavelet
coefficients $W_\Psi^F(\rho, a)$ on SO(3) for a signal $F$ with a wavelet $\Psi$, at each analysis scale
$a$. This relation is essential to avoid the large $\mathcal{O}(B^5)$ computational complexity for
the wavelet coefficients through simple quadrature in real space. A corresponding
harmonic space algorithm can be designed. The band-limited signal $F$ is given in
terms of its sampled values $F(\omega_i)$ on the $\mathcal{O}(B^2)$ discrete points $\omega_i$ of the chosen
pixelization of $S^2$. The algorithm first computes the required direct spherical harmonic
transforms (4.7) for $F_{lm}$ and $(\Psi_a)_{lm}$, second performs the pointwise product (4.28),
and finally computes the inverse Wigner D-function transform (4.27) to obtain the
wavelet coefficients. The resulting band-limited coefficients $W_\Psi^F(\rho, a)$ at each scale
$a$ are given in terms of sampled values $W_\Psi^F(\rho_i, a)$ on the $\mathcal{O}(B^3)$ discrete points $\rho_i$
of the chosen pixelization of SO(3). Using the separation of variables, the
computational complexity of the spherical harmonic coefficients is of order $\mathcal{O}(B^3)$. The
computational complexity of the pointwise product (4.28) is also of order $\mathcal{O}(B^3)$.
Again using the separation of variables, the computational complexity of the inverse
Wigner D-function transform is of order $\mathcal{O}(B^4)$, which consequently sets the overall
computational complexity of the algorithm.

Consequently, a harmonic space algorithm relying on the separation of variables
on iso-latitude pixelizations on the sphere allows computation of a directional
correlation in $\mathcal{O}(B^4)$ instead of $\mathcal{O}(B^5)$ operations. The exactness of the computation
relies only on the exactness of computation of the spherical harmonic coefficients
of the signal and the wavelet, which depends on the existence of a sampling theorem
on the pixelization chosen.

4.4.2.2 Factorization of Rotations

A second fast algorithm for the directional correlation of a band-limited signal $F$
with a band limit $B$ and a wavelet $\Psi$ may be designed through the technique of
factorization of three-dimensional rotations detailed below.

The rotation operator $R(\rho)$ on functions in $L^2(S^2, d\Omega)$ may be factorized as [204,
231, 188]

$$R(\varphi_0, \theta_0, \chi) = R\left(\varphi_0 - \frac{\pi}{2}, -\frac{\pi}{2}, \theta_0\right) R\left(0, \frac{\pi}{2}, \chi + \frac{\pi}{2}\right). \tag{4.62}$$

Let us recall that the Wigner D-functions are the matrix elements of the
operator $R(\rho)$. The factorized form (4.9) thus provides an alternative expression for the
wavelet coefficients at scale $a$ relative to the explicit inverse Wigner D-function
transform (4.27) as

$$W_\Psi^F(\rho, a) = \sum_{|m|,|m'|,|n| < B} \langle R\Psi_a | F\rangle_{mm'n}\; e^{i(m\varphi_0 + m'\theta_0 + n\chi)}. \tag{4.63}$$

The coefficients in this relation read as

$$\langle R\Psi_a | F\rangle_{mm'n} = e^{i(n-m)\pi/2} \sum_{l=C}^{B-1} d^l_{m'm}\left(\frac{\pi}{2}\right) d^l_{m'n}\left(\frac{\pi}{2}\right) (\Psi_a)^*_{ln}\, F_{lm}, \tag{4.64}$$

where $C = \max(|m|, |m'|, |n|)$, and where the symmetry relation $d^l_{m'm}(\theta) = d^l_{mm'}(-\theta)$
was used.

In this context, a harmonic space algorithm must first compute the required
spherical harmonic transforms for $F_{lm}$ and $(\Psi_a)_{lm}$, second perform the summation (4.64),
and finally compute the inverse transform (4.63) to obtain the wavelet coefficients.
The band-limited signal $F$ is given in terms of its sampled values $F(\omega_i)$ on the
$\mathcal{O}(B^2)$ discrete points $\omega_i$ of the chosen pixelization of $S^2$. Considering an
iso-latitude pixelization for $\theta$ and $\varphi$, the spherical harmonic transforms may be
computed by quadrature in $\mathcal{O}(B^3)$ operations through separation of variables in the
spherical harmonics. The computational complexity of the summation (4.64) for all
required values of $m$, $m'$, and $n$ is of order $\mathcal{O}(B^4)$. Again considering an iso-latitude
pixelization for $\theta_0$ and $\varphi_0$, the computation of the inverse transform (4.63) may be
performed in $\mathcal{O}(B^4)$ operations through separation of variables in the
imaginary exponentials, instead of an explicit separation of variables in the Wigner
D-functions themselves. Again, the resulting band-limited coefficients $W_\Psi^F(\rho, a)$ at
each scale $a$ are given in terms of sampled values $W_\Psi^F(\rho_i, a)$ on the $\mathcal{O}(B^3)$ discrete
points $\rho_i$ of the chosen pixelization of SO(3).

Consequently, a harmonic space algorithm relying on the factorization of
rotations on iso-latitude pixelizations on the sphere also allows computation of a
directional correlation in $\mathcal{O}(B^4)$ operations, now driven by the intermediate summation
(4.64) and the inverse transform (4.63), instead of $\mathcal{O}(B^5)$ operations. The exactness
of the computation again relies only on the exactness of the computation of the
spherical harmonic coefficients of the signal and the wavelet, which still depends on
the existence of a sampling theorem on the pixelization chosen.

Notice that, while the Euler angles $\varphi_0$ and $\chi$ are in the range $\varphi_0, \chi \in [0, 2\pi)$, the
original range for $\theta_0$ is $\theta_0 \in [0, \pi]$, in order to cover the parameter space of SO(3).
A formal extension of this interval to $\theta_0 \in [0, 2\pi)$ provides the parameter space of the
three-torus $T^3$, covering twice the parameter space of SO(3). In this context, the
relation (4.63) represents a three-dimensional inverse Fourier transform, which can
be calculated in $\mathcal{O}(B^3 \log_2 B)$ operations on a $2B \times 2B \times 2B$ equiangular grid on $T^3$
by the use of the standard Cooley-Tukey fast Fourier transform (FFT) algorithm.
This optimization, however, does not reduce the overall computational complexity
for the directional correlation, still driven by the summation (4.64).


4.4.2.3 Steerable Optimization 

Steerable wavelets may be used for further optimization of the algorithmic
complexity. Wavelets $\Psi$ are considered with a small number $M$ of basis functions,
indexed from $0$ to $M - 1$ in (4.31), which may actually represent rotations of a unique
filter in $M$ basis orientations in (4.35). In other words, the azimuthal band limit
$N$ of the wavelets is small relative to the overall band limit $B$: $M, N \ll B$. The
directional correlation with a steerable wavelet reduces to a linear combination of
$M$ standard correlations (4.33) with the basis functions. The computational
complexity of a directional correlation reduces to that of $M$ standard correlations, with
total a priori computational complexity of order $\mathcal{O}(MB^4)$, to which is simply added
the $\mathcal{O}(MB^3)$ linear combination that arises from (4.32). Either the technique of
separation of variables or the factorization of rotations can be applied to the
standard correlation on iso-latitude pixelizations on the sphere, by setting $\chi = 0$ in the
relation (4.27) or (4.63), respectively. The computational complexity count for the
index $n$, with $|n| < N$, is also reduced from $B$ down to $M$. It readily appears that the
corresponding computational complexity for the two algorithms hence reduces to
$\mathcal{O}(M^2B^3)$ rather than $\mathcal{O}(MB^4)$ for the directional correlation. When the basis
functions actually represent rotations of a unique filter in $M$ basis orientations in (4.35),
coefficients in all basis orientations may be computed at once, hence reducing the
overall computational complexity for the directional correlation to $\mathcal{O}(MB^3)$. The
steerable optimization thus renders the computation more easily affordable, even
when multiple signals and multiple scales are considered.
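
To give a sense of scale for the asymptotic orders quoted in this section, the short Python sketch below tabulates their leading terms with all constant factors dropped. The function name `operation_counts` and the dictionary labels are assumptions made for illustration; the numbers are orders of magnitude, not measured operation counts.

```python
def operation_counts(B, M):
    """Leading-order operation counts quoted in the text (constants dropped)."""
    return {
        "real-space quadrature": B ** 5,
        "harmonic space, generic wavelet": B ** 4,
        "M standard correlations, a priori": M * B ** 4,
        "steerable, M basis functions": M ** 2 * B ** 3,
        "steerable, rotations of a single filter": M * B ** 3,
    }

counts = operation_counts(B=1024, M=3)
gain = counts["real-space quadrature"] // counts["steerable, rotations of a single filter"]
```

For $B = 1024$ and $M = 3$, the ratio between the real-space quadrature and the fully optimized steerable algorithm is of order $B^2/M$, i.e., several hundred thousand.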

Details on the algorithmic structure, computation times, memory requirements,
and numerical stability of the corresponding implementations on HEALPix and
equiangular grids on the sphere may be found in [188] for the factorization of
rotations, and in [236] for the technique of separation of variables and the
optimization with steerable wavelets. Further possible optimization of the algorithm
to a computational complexity of order $\mathcal{O}(M^2B^2\log^2 B)$ can formally be reached
through separation of variables on equiangular pixelizations on the sphere. It notably
relies on the Driscoll and Healy algorithm for fast spherical harmonic transform
[76, 134, 133, 236].

Considering an axisymmetric wavelet $\Theta$, the azimuthal band limit is reduced to
$N = 1$: $\Theta_{ln} = 0$ for $|n| \ge 1$. The proper rotation by $\chi$ has no effect on the filter, and
the directional correlation reduces to a unique standard correlation. At each scale,
the wavelet coefficients of a signal with an axisymmetric filter live on the sphere $S^2$.
On iso-latitude pixelizations, the direct spherical harmonic transforms for $F_{lm}$ and
$(\Theta_a)_{l0}$, the pointwise product (4.40), and the inverse spherical harmonic transform
(4.39) can simply be computed by separation of variables in the spherical harmonics
(4.2). This provides an algorithmic structure with $\mathcal{O}(B^3)$ asymptotic complexity,
which again can formally be reduced to $\mathcal{O}(B^2\log^2 B)$ on equiangular pixelizations.


4.5 Discrete Formalism 

In this section, we define a discrete wavelet formalism following from a discretiza¬ 
tion of the scales in the continuous formalism based on the kernel dilation. We also 
comment on other possible constructions of a discrete wavelet formalism. 


4.5.1 Discrete Wavelets 

4.5.1.1 Scale Discretization 

In the context of the wavelet formalism relying on the kernel dilation,
scale-discretized wavelets $\Gamma$ can simply be obtained from continuous factorized
steerable wavelets $\Psi$ with compact harmonic support, through an integration by slices
of the dilation factor $a \in \mathbb{R}_+^*$. Through this transition procedure, scale-discretized
wavelets remain factorized steerable functions with compact harmonic support and
are dilated through the same kernel dilation.

We consider the analysis of a signal $F \in L^2(S^2, d\Omega)$ with band limit $B$.
The original continuous wavelet $\Psi \in L^2(S^2, d\Omega)$ has a kernel with compact support
in the interval $k \in (\alpha^{-1}B, B)$. The value $\alpha > 1$ regulates the compactness $c(\alpha)$ of $\Psi$.
It is also taken as a basis dilation factor. The discrete dilation factors for the
scale-discretized wavelet will correspond to integer powers $\alpha^j$, for analysis depths
$j \in \mathbb{N}$.

The scale-discretized wavelet $\Gamma \in L^2(S^2, d\Omega)$ is thus defined in factorized form:

$$\Gamma_{lm} = K_\Gamma(l)\,S^\Gamma_{lm}, \tag{4.65}$$

for a scale-discretized kernel defined by a positive real function $K_\Gamma(k) \in \mathbb{R}_+$ and a
directional split defined by the directionality coefficients $S^\Gamma_{lm}$. The directional split
of $\Gamma$ is identified with the split of $\Psi$,

$$S^\Gamma_{lm} = S^\Psi_{lm}, \tag{4.66}$$

also giving

$$S^\Gamma_{lm} = 0 \quad\text{for all } l, m \text{ with } |m| \ge N, \tag{4.67}$$

and

$$\sum_{|m| \le \min(N-1,\,l)} |S^\Gamma_{lm}|^2 = 1, \tag{4.68}$$

for $l \in \mathbb{N}^0$, while $S^\Gamma_{00} = 0$. In this relation, $N$ stands for the azimuthal band limit of the
steerable wavelets. The identification of the directional splits ensures that the same
steerability properties are shared by the continuous wavelet and the scale-discretized
wavelet. The scale-discretized kernel $K_\Gamma(k)$ is obtained from the continuous
kernel $K_\Psi(k)$ through an integration by slices of the dilation factor $a \in \mathbb{R}_+^*$ of the
continuous wavelet formalism.

As a first step, a positive real scaling function $\Phi_\Gamma(k) \in \mathbb{R}_+$ of a continuous
variable $k \in \mathbb{R}_+$ is defined that gathers the largest dilation factors $a \in (1, \infty)$, or
correspondingly the lowest values of $k$. This generating function reads for $k \in \mathbb{R}_+^*$ as

$$\Phi_\Gamma^2(k) = \frac{1}{C_\Psi}\int_{1}^{\infty}\frac{da}{a}\,K_\Psi^2(ak) = \frac{1}{C_\Psi}\int_{(\alpha^{-1}B,\,B)\cap(k,\,\infty)}\frac{dk'}{k'}\,K_\Psi^2(k'), \tag{4.69}$$

and is continuously extended at $k = 0$ by $\Phi_\Gamma(0) = 1$. The scaling function $\Phi_\Gamma(k)$
therefore decreases continuously from unity down to zero in the interval
$k \in (\alpha^{-1}B, B)$:

$$\begin{aligned}
\Phi_\Gamma(k) &= 1 && \text{for } 0 \le k \le \alpha^{-1}B,\\
\Phi_\Gamma(k) &\in (0,1) && \text{for } \alpha^{-1}B < k < B,\\
\Phi_\Gamma(k) &= 0 && \text{for } k \ge B.
\end{aligned} \tag{4.70}$$

Notice that similar procedures of scale integration by slices were already proposed
in the development of corresponding formalisms for directional wavelets on the
plane [78, 190, 228], and in the particular case of axisymmetric wavelets on the
sphere [97].

As a second step, a simple Littlewood-Paley decomposition [95] is used to define
the scale-discretized kernel $K_\Gamma(k)$ by subtracting the scaling function $\Phi_\Gamma(k)$ from its
contracted version $\Phi_\Gamma(\alpha^{-1}k)$, in terms of their squares. This implicitly sets the value $\alpha$ as the basis dilation
factor. The scale-discretized kernel also reads as an integration of the continuous
kernel over a slice $a \in (\alpha^{-1}, 1)$ for the dilation factor, or equivalently over a slice
$k' \in (\alpha^{-1}B, B) \cap (\alpha^{-1}k, k)$ of the compact support interval:

$$K_\Gamma^2(k) = \Phi_\Gamma^2(\alpha^{-1}k) - \Phi_\Gamma^2(k) = \frac{1}{C_\Psi}\int_{\alpha^{-1}}^{1}\frac{da}{a}\,K_\Psi^2(ak) = \frac{1}{C_\Psi}\int_{(\alpha^{-1}B,\,B)\cap(\alpha^{-1}k,\,k)}\frac{dk'}{k'}\,K_\Psi^2(k'). \tag{4.71}$$
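
The equivalence of the integral forms in (4.71) follows from a change of variable. As a short check, assuming the scaling function of (4.69) is defined by $\Phi_\Gamma^2(k) = C_\Psi^{-1}\int_k^\infty dk'/k'\,K_\Psi^2(k')$ with $K_\Psi$ supported in $(\alpha^{-1}B, B)$, the substitution $k' = ak$ at fixed $k$ (so that $da/a = dk'/k'$) gives

$$\frac{1}{C_\Psi}\int_{\alpha^{-1}}^{1}\frac{da}{a}\,K_\Psi^2(ak) = \frac{1}{C_\Psi}\int_{\alpha^{-1}k}^{k}\frac{dk'}{k'}\,K_\Psi^2(k') = \frac{1}{C_\Psi}\left[\int_{\alpha^{-1}k}^{\infty} - \int_{k}^{\infty}\right]\frac{dk'}{k'}\,K_\Psi^2(k') = \Phi_\Gamma^2(\alpha^{-1}k) - \Phi_\Gamma^2(k).$$

The intersection with $(\alpha^{-1}B, B)$ in the integration domain simply records where the integrand is nonzero.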


The scale-discretized kernel therefore has a compact support in the interval
$k \in (\alpha^{-1}B, \alpha B)$:

$$K_\Gamma(k) = 0 \quad\text{for } k \notin (\alpha^{-1}B, \alpha B). \tag{4.72}$$

This support is wider than that for the original continuous kernel and the scaling
function. The corresponding compactness reads as $c(\alpha^2) = \alpha^2/(\alpha^2 - 1) \in [1, \infty)$.
The compact harmonic support of the scale-discretized wavelet $\Gamma$ itself is thus
defined in the interval $l \in (\lfloor\alpha^{-1}B\rfloor, \lceil\alpha B\rceil)$. The kernel also satisfies $K_\Gamma(0) = 0$,
leading to a scale-discretized wavelet $\Gamma$ with a zero mean on the sphere:

$$\frac{1}{4\pi}\int_{S^2} d\Omega\,\Gamma(\omega) = 0. \tag{4.73}$$

The dilations $\Gamma_{\alpha^j}$ by $\alpha^j$ of the scale-discretized wavelet are defined by the
kernels $K_\Gamma(\alpha^j k)$ for any analysis depth $j \in \mathbb{N}$. Each kernel has compact support
in the interval $k \in (\alpha^{-(1+j)}B, \alpha^{(1-j)}B)$ and exhibits a maximum at $k = \alpha^{-j}B$, with
$K_{\Gamma_{\alpha^j}}(\alpha^{-j}B) = 1$. The scale-discretized wavelet $\Gamma_{\alpha^j}$ at each analysis depth $j$ thus
has a compact harmonic support in the interval $l \in (\lfloor\alpha^{-(1+j)}B\rfloor, \lceil\alpha^{(1-j)}B\rceil)$. The
property $K_{\Gamma_{\alpha^j}}(0) = 0$ still ensures that each scale-discretized wavelet has a zero mean
on the sphere. Notice that for $j \ge 1$, one gets a dilation factor strictly greater than
unity, $\alpha^j > 1$, and the scale-discretized wavelet has a band limit less than or equal
to the assumed band limit $B$ for the signal $F$ to be analyzed. At $j = 0$, only the
values of the kernel in the interval $l \in (\lfloor\alpha^{-1}B\rfloor, B)$ are of interest, as higher
frequencies $l$ are truncated by the signal $F$ itself through the directional correlation.
One can equivalently consider that the compact support of the kernel is restricted
to $k \in (\alpha^{-1}B, B)$ in the definition of the scale-discretized wavelet at this first
analysis depth $j = 0$. For $j \le -1$, the lower bound of the compact harmonic support of
the scale-discretized wavelet is larger than or equal to the band limit $B$. The scale-discretized
wavelets with negative analysis depths can therefore be discarded, as the result of
their directional correlation with the signal $F$ would be identically zero.
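
The harmonic support at each analysis depth can be tabulated directly from the interval bounds given above. The following Python sketch is a minimal illustration under that reading; the name `harmonic_support` is an assumption, and for values of $\alpha$ that are not powers of two the floating-point floor and ceiling may need care near integer values.

```python
import math

def harmonic_support(B, alpha, j):
    """Open interval (lo, hi) of l on which the depth-j wavelet can be nonzero.

    lo = floor(alpha^-(1+j) * B), hi = ceil(alpha^(1-j) * B), with hi capped at B
    because frequencies l >= B are cut by the band-limited signal itself.
    """
    lo = math.floor(alpha ** (-(1 + j)) * B)
    hi = math.ceil(alpha ** (1 - j) * B)
    return lo, min(hi, B)

supports = [harmonic_support(64, 2, j) for j in range(4)]
```

For $B = 64$ and $\alpha = 2$ the successive supports halve with each depth, and the depth $j = 0$ interval is the one truncated at the band limit.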


4.5.1.2 Invertible Filter Bank 


The admissibility condition (4.59) for continuous wavelets simply turns into a
resolution of the identity below the band limit by a set of dilated wavelets at various
analysis depths $j$, with $0 \le j \le J$, and a dilated scaling function at some total
analysis depth $J \in \mathbb{N}$. This defines what one can call a filter bank on the sphere. One gets
in particular for $0 \le k = l < B$:

$$\Phi_\Gamma^2(\alpha^J l) + \sum_{j=0}^{J} K_\Gamma^2(\alpha^j l) = 1. \tag{4.74}$$

The scaling function values $\Phi_\Gamma(\alpha^J l)$ are equal to unity in the interval
$l \in [0, \lfloor\alpha^{-(1+J)}B\rfloor]$, then decrease in the interval $l \in (\lfloor\alpha^{-(1+J)}B\rfloor, \lceil\alpha^{-J}B\rceil)$, and
are equal to zero for $l \ge \lceil\alpha^{-J}B\rceil$. The kernel values $K_\Gamma(\alpha^j l)$ are nonzero only in
the compact harmonic support interval $l \in (\lfloor\alpha^{-(1+j)}B\rfloor, \lceil\alpha^{(1-j)}B\rceil)$. The scaling
function typically retains the low-frequency part of the signal, which will not be
analyzed. All signal information at frequencies $l \le \lfloor\alpha^{-(1+J)}B\rfloor$ is kept only in the
scaling function, equal to unity. The wavelets are equal to zero at these frequencies.
All signal information at frequencies $l \ge \lceil\alpha^{-J}B\rceil$ is fully analyzed by the wavelets,
while the scaling function is equal to zero. Intermediate frequencies are also
analyzed by the wavelets, but the scaling function is required for the reconstruction of
the corresponding signal information.

Let us define the maximum analysis depth $J_B(\alpha)$ as the lowest integer value such
that $\alpha^{-J_B(\alpha)}B \le 1$:

$$J_B(\alpha) = \lceil \log_\alpha B \rceil. \tag{4.75}$$

In a case where the total analysis depth would be chosen strictly above $J_B(\alpha)$, all
wavelets at analysis depths $j$ with $J \ge j \ge J_B(\alpha) + 1$ would be identically null,
as their kernels have a compact support strictly included in the interval $k \in (0,1)$.
The total analysis depth is consequently naturally limited by $J \le J_B(\alpha)$. In the case
$J = J_B(\alpha)$, the dilated scaling function evaluated at $\alpha^{J_B(\alpha)} l$ has a nonzero
value only at $l = 0$, $\Phi_\Gamma(\alpha^{J_B(\alpha)} l) = \delta_{l0}$, while all wavelets are
equal to zero at $l = 0$ as they have a zero mean. Hence, the identity can be resolved with
$J_B(\alpha) + 1$ dilated wavelets and a trivial scaling function that simply retains the
spherical harmonic coefficient $\widehat{F}_{00}$ out of the analysis, or equivalently the mean
of the signal over the sphere. One gets in particular for $0 \le k = l < B$:

$$\delta_{l0} + \sum_{j=0}^{J_B(\alpha)} K_\Gamma^2(\alpha^j l) = 1. \tag{4.76}$$
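
As a small numerical illustration of the maximum analysis depth and of the compact harmonic supports, the following Python sketch may help. The function names are hypothetical, and the cut of the upper bound at the band limit is an assumption motivated by the fact that a band-limited signal carries no information at $l \ge B$.

```python
def max_analysis_depth(B, alpha):
    """Smallest integer J with alpha**(-J) * B <= 1, i.e. ceil(log_alpha(B))."""
    J = 0
    while alpha ** J < B:
        J += 1
    return J


def kernel_support(B, alpha, j):
    """Open interval (lo, hi) outside of which the kernel at depth j vanishes.

    Assumption: the upper bound is cut at the band limit B, since the signal
    has no harmonic content at frequencies l >= B.
    """
    lo = B / alpha ** (1 + j)
    hi = min(B * alpha ** (1 - j), B)
    return lo, hi


def supported_frequencies(B, alpha, j):
    """Integer frequencies l strictly inside the support interval at depth j."""
    lo, hi = kernel_support(B, alpha, j)
    return [l for l in range(B) if lo < l < hi]


if __name__ == "__main__":
    B, alpha = 1024, 2
    J_B = max_analysis_depth(B, alpha)
    for j in range(J_B - 3, J_B + 1):
        print(j, kernel_support(B, alpha, j), supported_frequencies(B, alpha, j))
```

For integer $l$, testing membership in the open interval $(\alpha^{-(1+j)}B, \alpha^{(1-j)}B)$ selects the same frequencies as the interval written with floor and ceiling brackets.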


4.5.1.3 Analysis 


Following the scale discretization defining the wavelets $\Gamma \in L^2(\mathbb{S}^2, d\Omega)$,
a new scale-discretized wavelet formalism is provided for the analysis and exact
reconstruction of band-limited signals.

The analysis of a band-limited signal $F \in L^2(\mathbb{S}^2, d\Omega)$ with band limit $B$, with a
scale-discretized wavelet $\Gamma$, is performed by directional correlations just as in the
continuous wavelet formalism. At each analysis depth $j$ with $0 \le j \le J \le J_B(\alpha)$, the
wavelet coefficients $W_\Gamma^F(\rho, \alpha^j)$ characterizing the signal around each continuous
point $\omega_0 = (\theta_0, \varphi_0) \in \mathbb{S}^2$, and in each continuous orientation
$\chi \in [0, 2\pi)$, are still defined by the directional correlation of $F$ with the analysis
functions $\Gamma_{\alpha^j}$ dilated through the kernel dilation by the dilation factor $\alpha^j$:

$$W_\Gamma^F(\rho, \alpha^j) = \langle \Gamma_{\rho,\alpha^j} | F \rangle, \tag{4.77}$$

with $\rho = (\varphi_0, \theta_0, \chi)$. Once more, at each analysis depth $j$, the wavelet
coefficients read as an inverse Wigner D-function transform:


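$$W_\Gamma^F(\rho, \alpha^j) = \sum_{l \in \mathbb{N}} \frac{2l+1}{8\pi^2} \sum_{|m|,|n| \le l} \left(\widehat{W_\Gamma^F}\right)^l_{mn}(\alpha^j)\, D^{l*}_{mn}(\rho), \tag{4.78}$$

with the Wigner D-function coefficients

$$\left(\widehat{W_\Gamma^F}\right)^l_{mn}(\alpha^j) = \frac{8\pi^2}{2l+1}\, \widehat{(\Gamma_{\alpha^j})}^{*}_{ln}\, \widehat{F}_{lm}. \tag{4.79}$$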

Again, the factorization relation (4.65) allows one to understand the directional 
correlation (4.79) as a double correlation, by the kernel and the directional split 
successively. 


4.5.1.4 Exact Reconstruction 


The reconstruction of the band-limited signal $F$ from its wavelet coefficients reads
in terms of a summation on a finite number $J + 1$ of discrete dilation factors:

$$F(\omega) = [\Phi_{\alpha^J}F](\omega) + \sum_{j=0}^{J} \int_{SO(3)} d\rho\, W_\Gamma^F(\rho, \alpha^j) \left[R(\rho) L^{d} \Gamma_{\alpha^j}\right](\omega). \tag{4.80}$$

The approximation $[\Phi_{\alpha^J}F](\omega)$ accounts for the part of the signal retained in the
scaling function $\Phi_\Gamma(\alpha^J l)$. In a very similar way to the part of the signal analyzed
by the wavelets, it can be written as

$$[\Phi_{\alpha^J}F](\omega) = 2\pi \int_{\mathbb{S}^2} d\Omega_0\, W_\Phi^F(\omega_0, \alpha^J) \left[R(\omega_0) L^{d} \Phi_{\alpha^J}\right](\omega), \tag{4.81}$$

with $W_\Phi^F(\omega_0, \alpha^J) = \langle \Phi_{\omega_0,\alpha^J} | F \rangle$, and for an axisymmetric
function $\Phi \in L^2(\mathbb{S}^2, d\Omega)$ defined by $\widehat{\Phi}_{lm} = \Phi_\Gamma(l)\,\delta_{m0}$.
In the particular case where $J = J_B(\alpha)$, one gets
$\widehat{(\Phi_{\alpha^{J_B(\alpha)}})}_{lm} = \delta_{l0}\delta_{m0}$ and the approximation simply
reduces to the mean of the signal over the sphere:
$[\Phi_{\alpha^{J_B(\alpha)}}F] = (4\pi)^{-1} \int_{\mathbb{S}^2} d\Omega\, F(\omega)$. The zero-mean
signal is completely analyzed by the scale-discretized wavelets. The operator $L^{d}$ in
$L^2(\mathbb{S}^2, d\Omega)$ in the present scale-discretized wavelet formalism is defined by the
following action on the spherical harmonic coefficients of functions:
$\widehat{(L^{d}G)}_{lm} = (2l+1)\,\widehat{G}_{lm}/8\pi^2$. This operator defining the
scale-discretized wavelets $L^{d}\Gamma_{\alpha^j}$ used for reconstruction is independent of
$\Gamma$, contrary to the operator $L_\Psi$ for continuous wavelets. This simply comes from the
fact that the scale-discretized wavelets are, through their definition (4.71), normalized
by $C_\Psi$.

Just as in the continuous wavelet formalism where the admissibility 
condition (4.59) is required, the present reconstruction formula holds if and only 
if the scale-discretized wavelet satisfies the constraints (4.68), and (4.74) or (4.76). 
These constraints are automatically satisfied by construction of the scale-discretized 
wavelets through the integration by slices. Again, this corresponds to the requirement
that the wavelet family as a whole, including the scaling function, preserves
the signal information at each frequency $l \in \mathbb{N}$.

Equivalently, the decomposition of the reconstruction relation (4.80) into
spherical harmonic coefficients reads as a finite summation:

$$\widehat{F}_{lm} = \widehat{[\Phi_{\alpha^J}F]}_{lm} + \frac{2l+1}{8\pi^2} \sum_{j=0}^{J} \sum_{|n| \le \min(N-1,\,l)} \widehat{(\Gamma_{\alpha^j})}_{ln} \left(\widehat{W_\Gamma^F}\right)^l_{mn}(\alpha^j), \tag{4.82}$$

with

$$\widehat{[\Phi_{\alpha^J}F]}_{lm} = \Phi_\Gamma^2(\alpha^J l)\, \widehat{F}_{lm}. \tag{4.83}$$

Let us emphasize the fact that a finite number of discrete dilation factors 
are required for the analysis and reconstruction of a band-limited signal. Contrary 
to the case of the continuous dilation factors, this allows the exact reconstruction 
of band-limited signals from relation (4.80), on pixelizations of the sphere where a 
sampling theorem holds. 
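
The harmonic-space form of the analysis and reconstruction can be checked numerically. The sketch below is only an illustration under stated assumptions: it treats the axisymmetric case (a single azimuthal index $n = 0$, with wavelet coefficients equal to the kernel values), stores harmonic coefficients in an ad hoc array layout, and uses randomly generated filters that are hypothetical and merely normalized so that the squared scaling function and squared kernels sum to one at each $l$.

```python
import numpy as np

rng = np.random.default_rng(0)
B, J = 16, 3  # small band limit and total analysis depth, for illustration only

# Hypothetical filters indexed by l: row 0 plays the role of the scaling function
# at depth J, rows 1..J+1 the role of the kernels at depths j = 0..J.
# Normalizing each column enforces phi^2 + sum_j K_j^2 = 1 at every l.
w = rng.random((J + 2, B))
w /= np.sqrt((w ** 2).sum(axis=0))
phi, K = w[0], w[1:]

# Random harmonic coefficients F_lm, stored with shape (B, 2B-1), column m + B - 1.
F = np.zeros((B, 2 * B - 1), dtype=complex)
for l in range(B):
    m = np.arange(-l, l + 1)
    F[l, m + B - 1] = rng.normal(size=m.size) + 1j * rng.normal(size=m.size)

# Factor 8 pi^2 / (2l + 1), broadcast over m.
norm = 8 * np.pi ** 2 / (2 * np.arange(B)[:, None] + 1)

# Analysis: Wigner coefficients at each depth (axisymmetric case, n = 0 only).
W = [norm * np.conj(K[j])[:, None] * F for j in range(J + 1)]

# Synthesis: approximation term plus the finite sum over analysis depths.
F_rec = (phi ** 2)[:, None] * F + sum(K[j][:, None] * W[j] / norm for j in range(J + 1))

max_error = np.abs(F_rec - F).max()
print("maximum reconstruction error:", max_error)
```

The point of the random filters is that exactness relies only on the resolution of the identity at each frequency, not on the particular shape of the scaling function and kernels.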

4.5.1.5 Example Filter 

We describe a real scale-discretized factorized steerable wavelet $\Gamma$ with compact
harmonic support, designed through relations (4.65)-(4.71) from a real continuous
wavelet $\Psi$, for a band limit $B = 1024$, a basis dilation factor $\alpha = 2$, and an
azimuthal band limit $N = 3$ of steerability. The function is imposed to be real and
to be even both under rotation around itself by $\pi$ and under a change of sign on $\varphi$.
This corresponds to the constraints that only the $T = 3$ values $m \in \{-2, 0, 2\}$ are
allowed and $S_{lm}$ is real. We also impose that the directionality coefficients are
independent of $l$ for $l \ge 2$. This ensures that the directionality and steerability properties
are preserved for the analysis depths for which the lower bound of the compact
harmonic support is larger than 2. The continuous kernel is defined with a compact
support in the interval $k \in (512, 1024)$. The scaling function $\Phi_\Gamma(\alpha^j k)$ for each
analysis depth $j$ is obtained by numerical integration, with nonzero values in the
intervals $k \in (512/2^j, 1024/2^j)$. The scale-discretized kernel $K_\Gamma(\alpha^j k)$ then follows
with nonzero values in the intervals $k \in (512/2^j, 2048/2^j)$. For $j = 0$, the
corresponding compact support interval is cut at the band limit: $k \in (512, 1024)$. For
$1 \le j \le 9$, the intervals progressively move to lower frequencies and shrink. At
the maximum analysis depth $j = J_B(\alpha) = 10$, the compact support is shrunk to
$k \in (0.5, 2)$ and the scale-discretized kernel only contains the frequency $l = 1$. A
specific choice of directional split and continuous kernel is made under all these
constraints, as in [238]. Corresponding graphs are reported in Fig. 4.6.

Plots of the scale-discretized wavelet are also reported in Fig. 4.7. The wavelets
are represented at the four largest analysis depths, $7 \le j \le 10$, identifying the four
largest scales. At $j = 7$, $j = 8$, and $j = 9$, the compact supports of the
scale-discretized kernels respectively contain the frequencies $l = 5$ to $l = 15$ with a kernel
maximum at $l = 8$, $l = 3$ to $l = 7$ with a kernel maximum at $l = 4$, and $l = 2$ to
$l = 3$ with a kernel maximum at $l = 2$. At $j = 10$, the scale-discretized kernel only
contains the frequency $l = 1$. For the depths $j$ with $7 \le j \le 9$, the lowest frequencies
Fig. 4.6: Graphs of the continuous kernel (continuous red line), scaling function (dot-dashed blue
line), and scale-discretized kernels (dotted black lines) for the example scale-discretized wavelet
designed at a band limit $B = 1024$ and with a basis dilation factor $\alpha = 2$. The scale-discretized
kernels are plotted for the first five analysis depths $j$, with $0 \le j \le 4$. (Figure borrowed from
[238].)


$l$ are greater than or equal to 2 and the azimuthal frequency indices contained in the
directional split are $m \in \{-2, 0, 2\}$. These wavelets all have the same directionality
and steerability properties. For the depth $j = 10$, the scale-discretized wavelet is a
pure dipole ($l = 1$). The azimuthal frequency index is restricted to $m = 0$ due to the
constraint $|m| \le l$, ensuring that the harmonic structure on the sphere is respected,
and the wavelet is simply axisymmetric.


4.5.2 Other Constructions 

First, a scale-discretized wavelet formalism with relations (4.74) and (4.76) for 
factorized steerable wavelets with compact harmonic support can be developed by 
simply relying on a Littlewood-Paley decomposition, without any contact with the 
continuous wavelet formalism. One simply needs to choose any arbitrary scaling 
function satisfying relation (4.70) and define the corresponding scale-discretized 
kernels by differences of scaling functions at successive scales. 
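
A Littlewood-Paley construction of this kind can be sketched numerically. In the following Python sketch, the scaling function is a hypothetical cosine taper chosen only for illustration (whether it meets every requirement of relation (4.70) is not checked here), and the differences are assumed to be taken between squared scaling functions at successive scales, so that the kernels enter quadratically in the resolution of the identity.

```python
import numpy as np


def phi_sq(k, B, alpha):
    """Hypothetical squared scaling function: 1 for k <= B/alpha, 0 for k >= B,
    with a smooth cosine taper in between."""
    k = np.asarray(k, dtype=float)
    t = np.clip((k - B / alpha) / (B - B / alpha), 0.0, 1.0)
    return np.cos(0.5 * np.pi * t) ** 2


def kernel_sq(k, B, alpha):
    """Squared kernel as a difference of squared scaling functions
    at successive scales (assumed form)."""
    return phi_sq(k / alpha, B, alpha) - phi_sq(k, B, alpha)


B, alpha, J = 64, 2.0, 4
l = np.arange(B)

# Squared kernels at depths j = 0..J and squared scaling function at depth J.
kernels = [kernel_sq(alpha ** j * l, B, alpha) for j in range(J + 1)]
scaling = phi_sq(alpha ** J * l, B, alpha)

# The sum telescopes to phi_sq(l / alpha), which equals 1 for all l below B.
total = scaling + sum(kernels)
print("max deviation from unity:", np.abs(total - 1.0).max())
```

Because the scaling function is non-increasing, each squared kernel is nonnegative, and the kernel at depth $j$ vanishes outside $(\alpha^{-(1+j)}B, \alpha^{(1-j)}B)$, which reproduces the compact harmonic support described in Section 4.5.1.2.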

Such invertible filter banks based on the harmonic dilation were already developed
in the case of axisymmetric wavelets [97, 216]. Our definition of factorized
steerable wavelets with compact harmonic support allows a straightforward
generalization to directional wavelets with the kernel dilation. Also, notice that the
constraints of steerability and compact harmonic support for the scale-discretized
wavelets can technically be relaxed without affecting the Littlewood-Paley
decomposition. However, both properties are essential for the control of localization and
directionality properties through kernel dilation. Moreover, in the absence of compact
harmonic support, relation (4.74) turns into a resolution of the contracted scaling
function $\Phi_\Gamma(\alpha^{-1} l)$, which differs from unity below the band limit. In other
words, the filter bank developed in such a case analyzes the part of the signal
Fig. 4.7: Plots of the example scale-discretized wavelet designed with $B = 1024$, $\alpha = 2$, and $N = 3$.
The wavelets are represented at the four largest analysis depths: from left to right $j = 7$, $j = 8$, and
$j = 9$, and below $j = 10$. Light and dark regions respectively correspond to positive and negative
values of the functions (see value bars). The wavelets are neither translated, i.e., they have their
central position at the north pole, nor rotated, i.e., they are in their original orientation $\chi = 0$
(the meridian $\varphi = 0$ corresponds to a vertical line passing by the north pole). (Figure borrowed
from [238].)


corresponding to its standard correlation with the contracted scaling function, rather
than the signal itself. In the absence of compact harmonic support and steerability,
essential multiresolution properties are also lost (see Section 4.6.1). The memory
and computation time requirements of the algorithm for the analysis and
reconstruction of signals therefore increase significantly and may rapidly become
overwhelming.

Second, also notice that scale-discretized axisymmetric wavelets with compact 
harmonic support and dilated through kernel dilation were introduced under the 
name of needlets [193, 11, 178]. It is possible to show that the needlet coefficients of 
a wide class of random signals on the sphere are uncorrelated in the asymptotic limit 
of small scales, at any fixed angular distance on $\mathbb{S}^2$. The scale-discretized steerable
wavelets with compact harmonic support, thanks to their factorized form and to the 
choice of the kernel dilation, are also good candidates for a directional extension of 
needlets. 

Third, invertible filter banks based on the stereographic dilation have also 
recently been proposed [245], but they do not share these essential multiresolution 
properties. 

Finally, frames of stereographic wavelets have been constructed in [25] by direct 
discretization of the translation, rotation, and dilation parameters. Notice though that 
the obtained frames are not tight, which means that numerical reconstruction can 
only be achieved by resorting to rather heavy algorithms that seek to approximate the
pseudo-inverse of the forward analysis wavelet operator. By contrast, the formalism 
discussed here leads to an efficient and simple numerical scheme. 


4.6 Reconstruction Algorithm 

In this section, we identify the multiresolution properties of the scale-discretized 
wavelet formalism developed. We also describe a corresponding fast algorithm for 
the analysis and reconstruction of signals. 


4.6.1 Multiresolution 


We consider the analysis and reconstruction of a signal $F \in L^2(\mathbb{S}^2, d\Omega)$ with a
scale-discretized wavelet $\Gamma \in L^2(\mathbb{S}^2, d\Omega)$, which is a factorized steerable function
with compact harmonic support. The band limit and basis dilation factor of interest are
denoted $B$ and $\alpha > 1$, respectively.

The compact harmonic support of the scale-discretized wavelet $\Gamma_{\alpha^j}$ is reduced to
the intervals $l \in (\lfloor \alpha^{-(1+j)}B \rfloor, \lceil \alpha^{(1-j)}B \rceil)$ through the kernel
dilation at each analysis depth $j$. As a function on SO(3), the wavelet coefficients at depth $j$
exhibit the same compact harmonic support as the scale-discretized wavelet $\Gamma_{\alpha^j}$.
From relation (4.79), the Wigner D-function transform $(\widehat{W_\Gamma^F})^l_{mn}(\alpha^j)$ of the
wavelet coefficients is indeed nonzero only in the same interval as the wavelet. In particular,
the band limit of the wavelet coefficients is decreased to $\lceil \alpha^{(1-j)}B \rceil$ at depth
$j$. Consequently, the number of sampled values required for the wavelet coefficients
is reduced at each increase of the analysis depth $j$ to $\alpha^{2(1-j)}\,\mathcal{O}(B^2)$ discrete
points of the form $(\omega_0)_{i(j)}$ on $\mathbb{S}^2$, where $i(j)$ simply indexes these points. The
number of operations required for their computation is reduced correspondingly. Hence, the
kernel dilation applied to scale-discretized wavelets with compact harmonic support
provides a first strong multiresolution property for the formalism. The steerability
of the wavelet is also important in the algorithmic structure of the analysis,
beyond its interest in preserving directionality properties through kernel dilation.
At each point $(\omega_0)_{i(j)}$ and at each analysis depth $j$, the wavelet coefficients of a
signal $F$ with the scale-discretized wavelet $\Gamma_{\alpha^j}$ are known for all continuous
rotation angles $\chi \in [0, 2\pi)$ as a linear combination of the wavelet coefficients of $F$ with
$M$ basis wavelets, which can be taken as specific rotations $\Gamma_{\chi_p,\alpha^j}$ of the wavelet
$\Gamma_{\alpha^j}$, with $0 \le p \le M - 1$. From this perspective, steerability provides a second
strong multiresolution property for the formalism.
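
The role of steerability can be illustrated on the dependence of a wavelet coefficient on the orientation $\chi$ at a fixed point and depth. The Python sketch below assumes that this dependence is a combination of the harmonics $e^{im\chi}$ with $m \in \{-2, 0, 2\}$, as for the example wavelet of Section 4.5.1.5, and that three basis orientations $\chi_p = p\pi/3$ are used. Both the sampling angles and the function names are illustrative assumptions, not taken from the text.

```python
import numpy as np

ms = np.array([-2, 0, 2])            # azimuthal frequencies present in the wavelet
chi_p = np.pi * np.arange(3) / 3.0   # assumed basis orientations


def steering_weights(chi):
    """Weights k_p(chi) such that exp(i m chi) = sum_p k_p(chi) exp(i m chi_p)
    for every m in ms."""
    A = np.exp(1j * np.outer(ms, chi_p))  # rows: m, columns: p
    b = np.exp(1j * ms * chi)
    return np.linalg.solve(A, b)


# Hypothetical orientation dependence of a coefficient at one point and depth.
rng = np.random.default_rng(1)
c = rng.normal(size=3) + 1j * rng.normal(size=3)


def coefficient(chi):
    return np.sum(c * np.exp(1j * ms * chi))


chi = 0.7
samples = np.array([coefficient(x) for x in chi_p])
steered = np.sum(steering_weights(chi) * samples)
print(abs(steered - coefficient(chi)))
```

Only the three sampled orientations are stored, and the value at any other angle follows from a small linear combination, which is what reduces the sampling on SO(3) in the third direction.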

In summary, when multiresolution properties of the formalism are fully
accounted for, a reduced number of discrete points of the form
$\rho_{I(j)} = ((\omega_0)_{i(j)}, \chi_p)$ on SO(3) are required for the sampled values
$W_\Gamma^F(\rho_{I(j)}, \alpha^j)$ of the wavelet coefficients, where $I(j) = \{i(j), p\}$ simply
indexes these points at each analysis depth $j$.


4.6.2 Fast Algorithm 

We describe here an algorithm for both analysis and reconstruction. As for the 
analysis algorithms described earlier in this chapter, it is designed in harmonic space 
in order to take advantage of the directional correlation relation (4.79). 

Some precalculations are required in order to build the wavelets $\Gamma_{\alpha^j}$ in real
space, from the spherical harmonic coefficients of a continuous wavelet in
relation (4.69). As will become clear below, the cost of these operations is
negligible relative to the cost of the analysis and reconstruction themselves. Moreover,
they must be performed only once for all signals to be analyzed.

The analysis may be performed by application of either the separation-of-variables
algorithm or the factorization-of-rotations algorithm defined in Section
4.4, at all analysis depths $j$ separately. The band-limited signal $F$ is given in terms
of its sampled values $F(\omega_i)$ on the $\mathcal{O}(B^2)$ discrete points $\omega_i$ of the chosen
pixelization of $\mathbb{S}^2$. The resulting band-limited coefficients $W_\Gamma^F(\rho, \alpha^j)$ at each
depth $j$ are given in terms of sampled values $W_\Gamma^F(\rho_{I(j)}, \alpha^j)$ on the
$\alpha^{2(1-j)}\,\mathcal{O}(MB^2)$ discrete points $\rho_{I(j)}$ of the chosen pixelization of SO(3).
The exactness of the computation relies only on the exactness of the computation of the
spherical harmonic coefficients of the signal, which depends on the existence of a sampling
theorem on the initial pixelization on which the original signal was sampled. The a priori
computational complexity for the directional correlation (4.77) by quadrature is of order
$\alpha^{4(1-j)} \times \mathcal{O}(MB^4)$ at each analysis depth $j$. This cost is reduced to
$\alpha^{3(1-j)}\,\mathcal{O}(MB^3)$ with the steerable optimization of the fast analysis algorithms.

The reconstruction part of the algorithm proceeds through the exact same
operations as the analysis, in reverse order. The Wigner D-function coefficients
$(\widehat{W_\Gamma^F})^l_{mn}(\alpha^j)$ of the wavelet coefficients are computed by quadrature
through a direct Wigner D-function transform at each analysis depth $j$. The spherical harmonic
coefficients $\widehat{F}_{lm}$ of the reconstructed signal are then obtained as a finite summation
following from relations (4.82) and (4.83). The samples $F(\omega_i)$ of the signal
are finally recovered by a simple inverse spherical harmonic transform. The reconstruction
is symmetric to the analysis and therefore requires the same number of
operations.

The total computational complexity of the algorithm is obtained by summing
over all analysis depths $j$ with $0 \le j \le J$. In the most exacting case where
$J = J_B(\alpha)$, it simply reads as

$$C = \left[1 + c(\alpha^3)\right] \times \mathcal{O}(MB^3). \tag{4.84}$$


In this expression, the impact of the compact harmonic support of the scale-discretized
wavelet is concentrated in $c(\alpha^3) = \alpha^3/(\alpha^3 - 1) \in [1, \infty)$. For example,
a complete decomposition at a high band limit with a dyadic decomposition of the
scales $\alpha = 2$ gives $C \simeq 15/7 \times \mathcal{O}(MB^3)$. Obviously, the more compact the support
of the scale-discretized wavelet, the larger the computational complexity.
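
The factor 15/7 quoted for $\alpha = 2$ can be recovered by summing the per-depth costs. The sketch below works in units of $\mathcal{O}(MB^3)$ and assumes a relative cost $\alpha^{3(1-j)}$ at depth $j \ge 1$ and a relative cost of 1 at $j = 0$, where the support is cut at the band limit. This reading of the $j = 0$ term is an assumption, and the function names are hypothetical.

```python
def c(x):
    """Geometric-series factor c(x) = x / (x - 1), for x > 1."""
    return x / (x - 1.0)


def relative_cost(alpha, J):
    """Total cost over depths j = 0..J, in units of O(M B^3).

    Assumption: the band limit of the coefficients is min(B, alpha**(1 - j) * B),
    so the cost at j = 0 is that of the full band limit B.
    """
    return sum(min(1.0, alpha ** (3 * (1 - j))) for j in range(J + 1))


if __name__ == "__main__":
    alpha, J = 2.0, 10  # dyadic scales with B = 1024, so J = J_B(alpha) = 10
    print(relative_cost(alpha, J), 1.0 + c(alpha ** 3), 15.0 / 7.0)
```

The finite sum differs from $1 + c(\alpha^3)$ only by the tail of the geometric series, which is negligible at a high band limit.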

Let us finally comment on the exactness of the analysis and reconstruction
algorithm described. If pixelizations of $\mathbb{S}^2$ and SO(3) are chosen where a sampling
theorem holds, exact quadrature rules hold both for the direct spherical harmonic
transform of a band-limited signal in the analysis part, and for the direct Wigner
D-function transform of its wavelet coefficients in the reconstruction part. Such
exact quadrature is accessible on equiangular or Gauss-Legendre pixelizations on
$\mathbb{S}^2$ for sampling of the original signal on points $\omega_i$, and for sampling the wavelet
coefficients at each analysis depth $j$ and for each value $\chi_p$ on points $(\omega_0)_{i(j)}$. In this
context, if the computed wavelet coefficients are not altered before reconstruction,
the exact same samples are obtained after reconstruction as for the original signal $F$.
Again, HEALPix pixelizations provide nonexact but very precise quadrature rules.

Details on the algorithmic structure, computation times, memory requirements,
and numerical stability of the corresponding implementation may be found in
[238].³


4.7 Applications 


In this section, we illustrate the usefulness of the wavelet formalisms developed for 
analysis and reconstruction of signals in the context of applications in astrophysics 
and neuroscience. 


4.7.1 Cosmic Microwave Background Analysis 

The aim of cosmology is the study of the structure and evolution of the universe. 
The last decades have led us to the dawn of a new era of precision cosmology, 
characterized by access to more and more precise observations of our universe. One 
of our best laboratories is the cosmic microwave background radiation (CMB). 

The CMB is a relic black-body radiation, a unique realization of a random
process that occurred in the early universe. The radiation is observed in all directions
of the celestial sphere. The corresponding data crystallize various forms of
complexities. The data are distributed on the surface of the celestial sphere. Current
and forthcoming experiments give access to high angular resolutions on the
celestial sphere, and therefore large volumes of data. The radiation is described not
only by a scalar temperature field but also by tensor polarization parameters. The
observed CMB signal is inevitably contaminated by galactic and extragalactic foreground
emissions that allow only partial sky coverage. Moreover, data are inevitably


³ Code for Steerable and Scale-Discretized Wavelets on the sphere (S2DW) is available for
download at the following URL: http://www.mrao.cam.ac.uk/~jdm57/software.html
contaminated by instrumental noise, blurred by an experimental beam, and affected
by other sources of systematic errors.

In this context, the analysis of CMB data is of major importance to obtain a 
precise picture of the universe through the definition of a cosmological model that 
best fits the data. Wavelets on the sphere appeared in recent years as an essential 
tool for CMB data analysis. Corresponding analyses of the Wilkinson Microwave
Anisotropy Probe satellite mission (WMAP) data have made it possible to probe the
fundamental pillars of our models. Gaussianity and statistical isotropy of the random
process from which the CMB arises have been studied and corresponding anomalies
were found in the data, thereby confirming and synthesizing other analyses. The
presence of a very poorly understood form of energy in the universe, called dark
energy, was also assessed and confirmed independently of other probes through
wavelet analyses. These analyses were based on the decomposition of the CMB
temperature signal on the sphere with continuous axisymmetric, directional, and 
steerable wavelets. Statistical analyses of the corresponding wavelet coefficients led 
to the physical conclusions. The first of these analyses is reviewed in [185]. More 
recent analyses are specifically based on steerable wavelets [240, 230, 189, 239]. 

To quote only one of these applications, steerable wavelets have allowed the
identification of anomalously preferred directions in the CMB data, through the
following process. The steerability gives access to morphological measures of local
features of the signal at a given analysis scale $a$ and around each point $\omega_0$. The
orientation $\chi_0(a, \omega_0)$ of a local feature may notably be defined as the direction in which
the signal response to the wavelet has maximum absolute value. This orientation
may simply be obtained from relation (4.32). The corresponding absolute value of
the signal identifies the intensity of the feature. The alignment of local CMB features
toward specific directions on the celestial sphere can then be probed by combining
the information on the orientation and on the intensity of local features. First, the
great circle is defined, which passes by the point $\omega_0$ and admits as a tangent the
local direction defined by $\chi_0(a, \omega_0)$. All directions on that great circle are considered
to be seen by the local feature at $\omega_0$ with a weight naturally given as the intensity
at that point. At scale $a$, the degree of preference of each direction $\omega$ is defined
as the sum of the weights originating from all points in the original signal for which
the defined great circle crosses the direction considered. Fig. 4.8 represents the
result of this alignment analysis with a second Gaussian derivative wavelet at a
typical scale of 10° of angular opening, as performed on WMAP data in comparison
with simulations based on an assumption of isotropy of the universe. Further
analysis of this result allowed the identification of a mean preferred plane with a
normal direction close to the CMB dipole axis, and a mean preferred direction in
this plane, very close to the ecliptic poles’ axis [240, 230].


4.7.2 Human Cortex Image Denoising 

Subtle changes in human cortical thickness are thought to be associated with 
neurological or clinical deficiencies. It is therefore important to detect these changes 



Fig. 4.8: Cumulative probabilities map in Mollweide projection of a HEALPix pixelization
($N_{\mathrm{side}} = 32$), for the degree of preference of each direction on the celestial sphere as
obtained by an alignment analysis of the CMB signal using continuous steerable wavelets. The
observed pattern presents several great circles of anomalously high (red) and low (blue) preference
with a value well above, or below, the median value of the simulations. (Figure borrowed from
[230].)


using bioimaging modalities. Typically, cortical thickness maps are inferred over the 
brain surface through magnetic resonance imaging (MRI) examinations. 
Cortical thickness features are extracted from these maps and studied using 
statistical tests. The acquisition process is, however, unavoidably affected by noise. 
As discussed in [21], the outcome of the statistical tests is known to be greatly 
improved by spatial smoothing, specifically with low-pass Gaussian filters, which 
enhances the signal-to-noise ratio. Since the cortex is a very convoluted surface,
there is no easy way to apply simple low-pass filtering directly. One successful
method, illustrated in Fig. 4.9, is to first map the cortex to a sphere where scalar
data can be analyzed in a simple way.

In this context, scale-discretized wavelets on the sphere offer a very flexible
and computationally efficient way to denoise data while preserving the most salient
features and most important spatial variations. Denoising with wavelets obviously
requires a discrete formalism where the signal may be reconstructed after
modification of its wavelet coefficients. In [21], the spherical cortical thickness map
is processed using scale-discretized axisymmetric wavelets on the sphere such as
those discussed in Section 4.5. The wavelet coefficients are then thresholded by soft
thresholding [69] in order to remove the noise, which is assumed to be uniformly
distributed over the coefficients, while the most important spatial variations of the
original data are encoded in the strongest wavelet coefficients only. This study shows
Fig. 4.9: A cortical thickness map is measured from T1 MRI sequences. The scalar map is then
inflated to a spherical surface before being processed. (Figure reproduced from [21], with the kind
permission of the authors.)


that wavelet denoising yields a significant improvement over spatial smoothing by 
Gaussian filtering, as illustrated in Fig. 4.10. 
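
For readers unfamiliar with soft thresholding, the operation applied to the wavelet coefficients can be sketched in a few lines of Python. This is only a generic illustration of the thresholding rule: the threshold value and the sample coefficients are made up, and nothing here reproduces the specific pipeline of [21].

```python
import numpy as np


def soft_threshold(w, t):
    """Shrink each coefficient toward zero by t; coefficients with |w| <= t become zero."""
    w = np.asarray(w, dtype=float)
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)


# Hypothetical wavelet coefficients and threshold, for illustration only.
coeffs = np.array([-3.0, -0.5, 0.2, 2.0])
denoised = soft_threshold(coeffs, 1.0)
print(denoised)
```

Small coefficients, assumed to be dominated by noise, are set to zero, while the strongest coefficients are kept with a reduced amplitude. The denoised signal is then obtained by applying the exact reconstruction to the modified coefficients.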


[Figure panels: Original thickness map; Gaussian smoothing; Spherical wavelet]

Fig. 4.10: Comparison of processing of a cortical thickness map by spatial smoothing using a 
Gaussian filter or by denoising using soft thresholding of wavelet coefficients. Wavelets allow 
for better reconstruction of morphological features. (Figure reproduced from [21], with the kind 
permission of the authors.) 
The wavelet transform has grown into a mature tool for data analysis and has
enjoyed large success in an amazingly broad spectrum of applications. In this 
chapter, we have reviewed several constructions that aim at extending wavelets to 
data defined in spherical geometry, which is a desirable generalization for many 
practical problems. 

Wavelets on the sphere implement multiresolution through specific dilation 
mechanisms, but the concept of directional correlation provides a flexible frame¬ 
work that unifies the various families of wavelets discussed here. We have seen that 
the continuous wavelet formalism on the sphere is a powerful analysis tool, while 
the scale-discretized wavelet formalism offers a full-fledged framework for digital 
data processing on the sphere, allowing for reconstruction from wavelet coefficients. 
Finally, this theoretical formalism is also backed up by computationally 
efficient algorithms. Virtually all applications involving wavelets can now be generalized
to data on the sphere. In particular, solving inverse problems in spherical
geometry, notably denoising and deconvolution problems, by imposing some sort 
of sparsity of wavelet coefficients is of significant practical interest in application 
fields ranging from astrophysics to neuroscience. 
Exercises

1. Prove the Plancherel relation (4.8) from the orthonormality of the spherical 
harmonics. 

2. Prove that the stereographic dilation $D_a$ on points in (4.18) is the unique radial
conformal diffeomorphism on $\mathbb{S}^2$.

3. Prove that the expression
$\lambda^{1/2}(a, \theta) = a^{-1}[1 + \tan^2(\theta/2)]/[1 + a^{-2}\tan^2(\theta/2)]$
is required to ensure unitarity of the dilation $D(a)$ of functions in $L^2(\mathbb{S}^2, d\Omega)$.

4. Prove the pointwise product form (4.28) for the Wigner D-function transform of
the directional correlation defining wavelet coefficients of a signal on $\mathbb{S}^2$.

5. Prove that the admissibility condition (4.30) is a necessary and sufficient condition
for the reconstruction formula (4.29).

6. Prove that the stereographic projection $\pi$ in (4.44) is the unique radial conformal
diffeomorphism mapping the sphere $\mathbb{S}^2$ onto the plane $\mathbb{R}^2$.

7. Prove that the prefactors in (4.44) are required to ensure the unitarity of the
projection operator $\Pi$ between $L^2(\mathbb{R}^2, d^2x)$ and $L^2(\mathbb{S}^2, d\Omega)$.

8. Prove the relation (4.62) for the factorization of rotations.

9. Prove that the admissibility condition (4.59) for continuous wavelets simply
turns into the resolution of the identity (4.74) after scale discretization.

10. Prove the expression (4.75) for the maximum analysis depth $J_B(\alpha)$.


Chapter 5 

Wiener’s Lemma: Theme and Variations. 
An Introduction to Spectral Invariance and 
Its Applications 


Karlheinz Gröchenig


Abstract Wiener’s Lemma is a classical statement about absolutely convergent 
Fourier series and remains one of the driving forces in the development of Banach 
algebra theory. In the first part of the chapter—the theme—we discuss Wiener’s 
Lemma in detail. We prove Wiener’s Lemma and discuss equivalent formulations 
about convolution operators. We then extract the underlying abstract concepts from 
Banach algebras. In the second part of the chapter—the variations—we discuss 
several, mostly noncommutative reincarnations of Wiener’s Lemma. We will 
develop some of the theoretical background and explain why Wiener’s Lemma is 
still useful and inspiring. The topics cover weighted versions of Wiener’s Lemma, 
infinite matrix algebras, noncommutative tori and time-frequency analysis, 
convolution operators on noncommutative groups, and time-varying systems and 
pseudodifferential operators. 


5.1 Introduction 

Wiener’s Lemma is a classical and seemingly innocent result about absolutely 
convergent Fourier series. In its original version it asserts that the pointwise 
inverse of an absolutely convergent Fourier series without zeros is again an 
absolutely convergent Fourier series. This result is contained in every text about 
Fourier series and in every treatment of commutative Banach algebras. Norbert 
Wiener needed this lemma for his proof of a “Tauberian theorem” [241, 242]. 

But Wiener’s Lemma is much more: Wiener’s Lemma is central in the development
of the abstract theory of Banach algebras and has inspired Gel’fand’s theory of
commutative Banach algebras [100, 101]. The generalizations of Wiener’s Lemma 
are now legion, and this chapter bears witness to some recent developments. 
Why is Wiener’s Lemma so important?


Wiener’s Lemma is a deep result about the invertibility and spectrum of certain
operators.


For the direct verification that the function $1/f$ possesses an absolutely convergent
Fourier series, we would have to calculate or estimate the Fourier coefficients
of $1/f$ and check that they are summable. Wiener’s Lemma offers a much easier
test. We only need to make sure that $f$ does not have any zeros; this means that $1/f$
exists as a continuous function. This is the heart of Wiener’s Lemma: A difficult
condition for the invertibility can be replaced by an easier, more evident, and more
convenient condition.

Clearly, the understanding of the invertibility is of tremendous importance for 
solving systems of linear equations or operator equations. Indeed, parallel to our 
journey through the manifold aspects of Wiener’s Lemma, we will discuss its relevance
in signal analysis. In engineering Wiener’s Lemma plays a vital role in the
analysis of signal transmission, time-invariant and time-varying channels, and for 
the signal recovery in wireless communications. 

The realm of Wiener’s Lemma is a vector space of functions that can be multiplied
by each other. Adding a norm and completeness, the mathematical structure
in the background is that of a Banach algebra. Thus, from an abstract point of view
Wiener’s Lemma is about invertibility in a Banach algebra and provides an easy
criterion: A function $f$ is invertible as an absolutely convergent Fourier series if
and only if it is invertible as a continuous function.

The invertibility of a function or an operator is closely related to its spectrum, 
and we will see how Wiener’s Lemma can be formulated as a result about spectral 
invariance. The spectrum of an absolutely convergent Fourier series $f$ is independent
of the underlying Banach algebra. The concept of spectral invariance is fundamental 
in many fields. One of our objectives is a unified treatment of spectral invariance 
in several areas of mathematics and to trace back several fundamental results on 
spectral invariance to Wiener’s Lemma. 

Following the title, this chapter is divided into two parts, the theme and its 
variations. 

Section 5.2 discusses the classical version of Wiener’s Lemma from different
angles. We introduce absolutely convergent Fourier series and then elaborate an
elementary proof of Wiener’s Lemma. This proof is void of abstract theory and requires
only elementary estimates from analysis. Only then do we provide the structural
background and interpret Wiener’s Lemma in the context of Banach algebra theory.
In Section 5.2.5 we discuss the main concepts of spectral invariance and the
immediate consequences. Finally, we convert Wiener’s Lemma to a statement about the
spectrum of convolution operators.

The material is fundamental for an appreciation of the variations, because the
sequence of definitions and results sets the pattern for the treatment of the variations,
which are usually considered parts of different fields of mathematics.
Section 5.2 is elementary and requires only basic analysis. We do not even need
the Gel’fand theory of commutative Banach algebras, although some knowledge of
Banach algebras may be helpful to appreciate the context and to open more
perspectives on Wiener’s Lemma.

Section 5.3 is devoted to variations of Wiener’s Lemma in several areas of
mathematics, namely in Fourier analysis, infinite matrix algebras and operator theory,
in the noncommutative geometry of tori and time-frequency analysis, in the
harmonic analysis of locally compact groups, and in the theory of pseudodifferential
operators.

In each of the variations we will set up the basic concepts, and then draw the
analogy between Fourier series and the new object, series of time-frequency shifts, say.
The analogy to Fourier series usually suggests some natural questions; in particular, 
we will always wonder whether a version of Wiener’s Lemma holds. We then will 
proceed to formulate answers that are similar to the classical Wiener’s Lemma. The 
proofs are usually much more advanced, and some proofs are outside the scope of 
this introduction. Wiener’s Lemma is often hidden in the proofs, and we will try to 
make the connections visible. Section 5.3 can no longer be self-contained, because 
the mathematical concepts are drawn from disjoint mathematical worlds. Our goal 
is to reveal the common structure and convince the reader that the topics treated are 
indeed variations of a classical theme, albeit the variations may be highly nontrivial. 

In Section 5.3.1 we introduce weights and investigate weighted versions of 
Wiener’s Lemma for Fourier series. We discuss the dichotomy between subexponential
and exponential weights and define the Gel’fand-Raikov-Shilov condition.
This is a new concept that arises invariably in spectral problems with weights. 

Section 5.3.2 deals with the spectral invariance of matrices with some form of 
off-diagonal decay and is a first version of a noncommutative Wiener’s Lemma. 
In Section 5.3.3 we turn to time-frequency analysis and investigate series of
time-frequency shifts instead of Fourier series. The next section, 5.3.4, treats convolution
operators on general locally compact groups. Here we have to content ourselves with 
explaining the concepts of harmonic analysis, stating the results, and discussing 
their meaning. Section 5.3.5 is devoted to pseudodifferential operators and their 
invertibility. We discuss a class of symbols (the so-called Sjöstrand class) that
resembles absolutely convergent Fourier series and formulate the correct
generalization of Wiener’s Lemma for pseudodifferential operators.

In the last section, 5.3.6, we explain how and why the results on pseudodifferential
operators can be used for the analysis of time-varying systems and in wireless
communications.


5.2 Wiener’s Lemma—Classical 

Let us motivate Wiener’s Lemma with a familiar statement from calculus. Let
$C^k(\mathbb{T})$ be the space of $k$-times continuously differentiable functions with
period 1. We identify the interval $[0,1)$ of a period with the torus
$\mathbb{T} = \{z \in \mathbb{C} : |z| = 1\}$ when necessary.
The product rule $(fg)' = f'g + fg'$ implies that $C^k(\mathbb{T})$ is an algebra. The
quotient rule for differentiation implies the following property of $C^k(\mathbb{T})$.

Lemma 5.1. If $f \in C^k(\mathbb{T})$ and $f(t) \neq 0$ for all $t \in [0,1)$, then
$1/f \in C^k(\mathbb{T})$.

Proof. Since $(1/f)' = -f'/f^2$, $1/f$ is continuously differentiable whenever
$f(t) \neq 0$. Now proceed by induction. Assume that we already know that
$1/f \in C^{\ell}(\mathbb{T})$ for $\ell < k$. Then $(1/f)' = -f'/f^2$ is continuous and
in $C^{\ell}(\mathbb{T})$ by the induction hypothesis. Therefore
$1/f \in C^{\ell+1}(\mathbb{T})$. □


5.2.1 Definitions from Banach Algebras 


The quotient rule and Lemma 5.1 are about the invertibility of differentiable 
functions. The abstract discussion of invertibility is best carried out in the context 
of Banach algebras. To begin with, let us recall the standard definitions. 

Definition 5.2. A Banach space $\mathcal{A}$ is called a Banach algebra if it possesses
a multiplication $\mathcal{A} \times \mathcal{A} \to \mathcal{A}$ that satisfies the
following properties for all $a, b, c \in \mathcal{A}$ and $\lambda \in \mathbb{C}$:

1. $(a+b)c = ac + bc$ and $a(b+c) = ab + ac$;

2. $(ab)c = a(bc)$;

3. $(\lambda a)b = a(\lambda b) = \lambda(ab)$;

4. $\|ab\| \le \|a\|\,\|b\|$.

We always assume that $\mathcal{A}$ possesses a unit element $e$ that satisfies
$ae = ea = a$ for all $a \in \mathcal{A}$. In a unital algebra an element $a$ is called
invertible if there exists an element $b \in \mathcal{A}$ such that $ab = ba = e$. If
such a $b$ exists, it is unique and called the inverse of $a$ and denoted by $a^{-1}$.

If we endow $C^k(\mathbb{T})$ with the norm
$\|f\|_{C^k} = \sum_{j=0}^{k} \frac{1}{j!}\,\|f^{(j)}\|_\infty$, then $C^k(\mathbb{T})$
becomes a Banach algebra with respect to pointwise multiplication (see Exercises).
Lemma 5.1 provides a simple and in this case rather obvious condition for the
invertibility of an element in the algebra $C^k(\mathbb{T})$.


5.2.2 Absolutely Convergent Fourier Series 

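Let us now introduce $\mathcal{A}(\mathbb{T})$, the main object of Wiener’s Lemma.

Definition 5.3. A periodic function $f$ possesses an absolutely convergent Fourier
series if it can be written as $f(t) = \sum_{k \in \mathbb{Z}} a_k e^{2\pi i k t}$ with
coefficients $\mathbf{a} = (a_k)_{k \in \mathbb{Z}} \in \ell^1(\mathbb{Z})$. In this
case we write $f \in \mathcal{A}(\mathbb{T})$ and endow $\mathcal{A}(\mathbb{T})$ with
the norm

$$\|f\|_{\mathcal{A}} = \|\mathbf{a}\|_1 = \sum_{k \in \mathbb{Z}} |a_k|.$$

It is not completely obvious that $\|\cdot\|_{\mathcal{A}}$ is a norm. This fact follows
from the uniqueness of the Fourier coefficients.

Lemma 5.4. The space $\mathcal{A}(\mathbb{T})$ is a Banach algebra under pointwise
multiplication.

Proof. Let $f(t) = \sum_{k \in \mathbb{Z}} a_k e^{2\pi i k t}$ and
$g(t) = \sum_{k \in \mathbb{Z}} b_k e^{2\pi i k t}$ with norms
$\|f\|_{\mathcal{A}} = \|\mathbf{a}\|_1$ and $\|g\|_{\mathcal{A}} = \|\mathbf{b}\|_1$. Then

$$
f(t)\,g(t) = \Big(\sum_{k \in \mathbb{Z}} a_k e^{2\pi i k t}\Big)\Big(\sum_{l \in \mathbb{Z}} b_l e^{2\pi i l t}\Big)
= \sum_{k,l \in \mathbb{Z}} a_k b_l\, e^{2\pi i (k+l) t}
= \sum_{n \in \mathbb{Z}} \underbrace{\Big(\sum_{k \in \mathbb{Z}} a_k b_{n-k}\Big)}_{(\mathbf{a} * \mathbf{b})(n)} e^{2\pi i n t}. \tag{5.1}
$$

The interchange of the summation is justified because both series converge
absolutely. Thus, the coefficients of the pointwise product $fg$ are given by the
convolution of the sequences $\mathbf{a}$ and $\mathbf{b}$. Now

$$
\|\mathbf{a} * \mathbf{b}\|_1 = \sum_{n \in \mathbb{Z}} \Big|\sum_{k \in \mathbb{Z}} a_k b_{n-k}\Big|
\le \sum_{k \in \mathbb{Z}} \sum_{n \in \mathbb{Z}} |a_k|\,|b_{n-k}| = \|\mathbf{a}\|_1\,\|\mathbf{b}\|_1,
$$

and consequently,

$$\|fg\|_{\mathcal{A}} = \|\mathbf{a} * \mathbf{b}\|_1 \le \|\mathbf{a}\|_1\,\|\mathbf{b}\|_1 = \|f\|_{\mathcal{A}}\,\|g\|_{\mathcal{A}}.$$

The other properties of Definition 5.2 are obvious, and thus $\mathcal{A}(\mathbb{T})$
is a Banach algebra. □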

In Section 5.3 we will encounter several variations of this proof in rather different 
contexts. 
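
The computation in the proof of Lemma 5.4 can be checked numerically for finitely
supported coefficient sequences. The following Python sketch (NumPy) is only an
illustration: the random coefficients, the degree `N`, and the helper `evaluate` are
assumptions made for this example and do not come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical coefficient sequences a_k, b_k supported on |k| <= N.
N = 3
a = rng.standard_normal(2 * N + 1) + 1j * rng.standard_normal(2 * N + 1)
b = rng.standard_normal(2 * N + 1) + 1j * rng.standard_normal(2 * N + 1)


def evaluate(coeffs, t):
    """Evaluate sum_k c_k exp(2 pi i k t) for k = -M..M, where len(coeffs) = 2M + 1."""
    M = (len(coeffs) - 1) // 2
    k = np.arange(-M, M + 1)
    return np.exp(2j * np.pi * np.outer(t, k)) @ coeffs


# Coefficients (a * b)(n) for n = -2N..2N, as in (5.1).
c = np.convolve(a, b)

# The pointwise product f*g should agree with the series built from a * b.
t = np.linspace(0.0, 1.0, 50, endpoint=False)
product_error = np.max(np.abs(evaluate(a, t) * evaluate(b, t) - evaluate(c, t)))

# Norms in A(T) are l^1 norms of the coefficient sequences.
norm_a, norm_b, norm_c = np.abs(a).sum(), np.abs(b).sum(), np.abs(c).sum()
print(product_error, norm_c, norm_a * norm_b)
```

In this sketch `product_error` should be at the level of rounding error, and `norm_c`
should not exceed `norm_a * norm_b`, in line with the submultiplicativity shown above.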


5.2.3 Wiener’s Lemma 

As with the algebra $C^k(\mathbb{T})$, we may now investigate the invertibility for
absolutely convergent Fourier series. The formulation of Lemma 5.1 suggests the
following question: If $f \in \mathcal{A}(\mathbb{T})$ and $f(t) \neq 0$ for all
$t \in [0,1)$, is $f$ then invertible in $\mathcal{A}(\mathbb{T})$? In the absence of a
quotient rule the answer is by no means obvious. It is given by the following theorem,
which, with historical understatement, is now called Wiener’s Lemma.


Theorem 5.5 (Classical formulation). If $f \in \mathcal{A}(\mathbb{T})$ and
$f(t) \neq 0$ for all $t \in \mathbb{T}$, then also $1/f \in \mathcal{A}(\mathbb{T})$,
i.e., $1/f(t) = \sum_{k \in \mathbb{Z}} b_k e^{2\pi i k t}$ with
$\mathbf{b} \in \ell^1(\mathbb{Z})$.

Wiener’s original proof [241, 242] from 1932 uses a localization property and a
partition-of-unity argument. Today’s standard proof is an abstract proof via Gel’fand
theory [56, 150, 151, 209].

Wiener’s Lemma has intrigued many analysts and is the seed for the abstract
theory of Banach algebras by Gel’fand. In fact, Gel’fand [100, 101] developed his
theory of commutative Banach algebras for the specific purpose of finding a
conceptual proof of Wiener’s Lemma. Variations of Wiener’s Lemma and its proof were
found by Lévy [165] and Zygmund [248]. An interesting proof of Wiener’s Lemma
without the use of Fourier series was given by Hulanicki [142]. A very short,
elementary proof was found by Newman [194].

Quantitative aspects of Wiener’s Lemma, namely norm estimates for the inverse
$f^{-1}$, were investigated by Nikolski [195] and Tao [222].


5.2.4 Proof of Wiener’s Lemma 


Here we give an elementary proof of Wiener’s Lemma following Newman [194] 
and Hulanicki [142]. It is of interest in its own right because it does not use Fourier 
series and avoids the abstract notions of Gel’fand theory. 

Step 1. Reduction to special case. If $f \in \mathcal{A}(\mathbb{T})$, then also
$\bar{f} \in \mathcal{A}(\mathbb{T})$ and $|f|^2 = f \cdot \bar{f} \in \mathcal{A}(\mathbb{T})$.
Since $1/f = \bar{f}/|f|^2$, it suffices to show that $1/|f|^2 \in \mathcal{A}(\mathbb{T})$.
By replacing $f$ by $|f|^2$, we may assume without loss of generality that $f$ is
nonnegative. By normalizing, we may further assume that $0 \le f(t) \le 1$ for
$t \in \mathbb{T}$.

Now note that since $f$ is continuous, the assumption $f(t) \neq 0$ for all $t$ implies
that

$$\inf_{t \in \mathbb{T}} |f(t)| = \delta > 0. \tag{5.2}$$

Step 2. Analyze the invertibility of $f$ in $C(\mathbb{T})$ by a geometric series. Let
$h = 1 - f$; then

$$0 \le h(t) = 1 - f(t) \le 1 - \delta.$$

Hence, the geometric series $\sum_{n=0}^{\infty} h(t)^n$ converges in $C(\mathbb{T})$ and
possesses the limit

$$\sum_{n=0}^{\infty} h(t)^n = \frac{1}{1 - h(t)} = \frac{1}{f(t)} \in C(\mathbb{T}).$$

Our goal is to show that $\sum_n h^n$ also converges in $\mathcal{A}(\mathbb{T})$.

Step 3. Approximate $h$ by a trigonometric polynomial. Given $\varepsilon > 0$, choose a
trigonometric polynomial $p(t)$ such that

$$\|h - p\|_{\mathcal{A}} < \varepsilon.$$

In fact, if $f(t) = \sum_{k \in \mathbb{Z}} a_k e^{2\pi i k t}$, we may choose
$p(t) = 1 - \sum_{|k| \le N} a_k e^{2\pi i k t}$ for sufficiently large $N$ and obtain
$\|h - p\|_{\mathcal{A}} = \sum_{|k| > N} |a_k| < \varepsilon$.

Set $r = h - p$; then $h = p + r$ and $\|r\|_{\mathcal{A}} < \varepsilon$.





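As to the choice of $\varepsilon$, we will see that we must have
$1 - \delta + 2\varepsilon < 1$, where $\delta$ is given by (5.2).

Step 4. Some elementary estimates. First, if
$q(t) = \sum_{|k| \le N} b_k e^{2\pi i k t}$ is a trigonometric polynomial of degree $N$,
then

$$
\|q\|_{\mathcal{A}} = \sum_{|k| \le N} |b_k| \le \|\mathbf{b}\|_2\,(2N+1)^{1/2}
= \|q\|_2\,(2N+1)^{1/2} \le \|q\|_\infty\,(2N+1)^{1/2}.
$$

Second, if $q$ is a trigonometric polynomial of degree $N$, then $q^k$ is a
trigonometric polynomial of degree $kN$.

Step 5. Estimate the $\mathcal{A}$-norm of $h^n$. By the binomial theorem we have

$$h^n = (p + r)^n = \sum_{k=0}^{n} \binom{n}{k} p^k r^{n-k},$$

so we may estimate

$$
\|h^n\|_{\mathcal{A}} \le \sum_{k=0}^{n} \binom{n}{k} \|p^k r^{n-k}\|_{\mathcal{A}}
\le \sum_{k=0}^{n} \binom{n}{k} \|p^k\|_{\mathcal{A}}\,\|r^{n-k}\|_{\mathcal{A}}.
$$

Now by our choice of $p$ in Step 3, we have
$\|r^{n-k}\|_{\mathcal{A}} \le \|r\|_{\mathcal{A}}^{n-k} < \varepsilon^{n-k}$. By Step 4
applied to the trigonometric polynomials $p^k$, we have

$$\|p^k\|_{\mathcal{A}} \le \|p^k\|_\infty\,(2Nk+1)^{1/2} \le (2Nn+1)^{1/2}\,\|p\|_\infty^k.$$

Step 6. Complete the estimate for the $\mathcal{A}$-norm of $h^n$.

$$
\begin{aligned}
\|h^n\|_{\mathcal{A}} &\le (2Nn+1)^{1/2} \sum_{k=0}^{n} \binom{n}{k} \|p\|_\infty^k\, \varepsilon^{n-k} \\
&= (2Nn+1)^{1/2} \big(\|p\|_\infty + \varepsilon\big)^n = (2Nn+1)^{1/2} \big(\|h - r\|_\infty + \varepsilon\big)^n \\
&\le (2Nn+1)^{1/2} \big(\|h\|_\infty + 2\varepsilon\big)^n \le (2Nn+1)^{1/2} \big(\underbrace{1 - \delta + 2\varepsilon}_{<1}\big)^n.
\end{aligned} \tag{5.3}
$$

Step 7. Convergence of geometric series in $\mathcal{A}$-norm. Using the estimate from
the previous step, we finally obtain that

$$
\sum_{n=0}^{\infty} \|h^n\|_{\mathcal{A}} \le \sum_{n=0}^{\infty} (2Nn+1)^{1/2} (1 - \delta + 2\varepsilon)^n < \infty. \tag{5.4}
$$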


Thus, the geometric series $\sum_{n=0}^{\infty} h^n$ converges in
$\mathcal{A}(\mathbb{T})$, and we have proved that
$1/f = \sum_{n=0}^{\infty} h^n \in \mathcal{A}(\mathbb{T})$. □

Remark 5.6. Let us take $n$th roots in (5.3). Then we obtain

$$\lim_{n \to \infty} \|h^n\|_{\mathcal{A}}^{1/n} \le \|h\|_\infty + 2\varepsilon.$$

Since $\varepsilon > 0$ was arbitrary, this implies that

$$\lim_{n \to \infty} \|h^n\|_{\mathcal{A}}^{1/n} \le \|h\|_\infty.$$

Since $\|h^n\|_\infty \le \|h^n\|_{\mathcal{A}}$ always holds, the estimate in (5.3)
implies that

$$
\lim_{n \to \infty} \|h^n\|_{\mathcal{A}}^{1/n} = \|h\|_\infty = \lim_{n \to \infty} \|h^n\|_\infty^{1/n}. \tag{5.5}
$$

This identity can be interpreted as a statement about spectral radii. It is
fundamental for the generalizations and abstract versions of Wiener’s Lemma. See
Proposition 5.11.
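
Steps 2 to 7 of the proof and the identity (5.5) can be illustrated numerically. The
Python sketch below is an assumption-laden example, not part of the original argument:
the function $h(t) = 0.52\,(\cos 2\pi t + \sin 4\pi t)$ is a hypothetical choice with
$\|h\|_{\mathcal{A}} = 1.04 > 1$ but $\|h\|_\infty < 1$, so the geometric series is not
obviously convergent in the $\mathcal{A}$-norm. Unlike in Step 1, this $h$ is not
nonnegative; the sketch only uses $|h(t)| \le 1 - \delta$. Powers of $h$ are computed by
convolving coefficient sequences as in (5.1), and the supremum norm is estimated on a grid.

```python
import numpy as np

# Hypothetical example: h(t) = 0.52 * (cos(2 pi t) + sin(4 pi t)), f = 1 - h.
# Fourier coefficients of h for k = -2, ..., 2.
h = np.array([0.26j, 0.26, 0.0, 0.26, -0.26j])

t = np.arange(200) / 200
h_values = 0.52 * (np.cos(2 * np.pi * t) + np.sin(4 * np.pi * t))
sup_h = np.abs(h_values).max()      # grid estimate of ||h||_inf (below 1)
norm_A_h = np.abs(h).sum()          # ||h||_A = 1.04 (above 1)

n_terms = 300
deg = 2 * n_terms                   # degree of h^n for n = n_terms
partial_sum = np.zeros(2 * deg + 1, dtype=complex)  # coefficients for k = -deg..deg
power = np.array([1.0 + 0.0j])      # coefficients of h^0 = 1
norms = []                          # norms[n] = ||h^n||_A
for n in range(n_terms + 1):
    m = (len(power) - 1) // 2       # h^n has degree m = 2n
    partial_sum[deg - m: deg + m + 1] += power
    norms.append(np.abs(power).sum())
    power = np.convolve(power, h)   # coefficients of h^(n+1), cf. (5.1)

# Compare the partial sum of the geometric series with 1/f = 1/(1 - h) on the grid.
k = np.arange(-deg, deg + 1)
series_values = np.exp(2j * np.pi * np.outer(t, k)) @ partial_sum
error = np.abs(series_values - 1.0 / (1.0 - h_values)).max()

# n-th root of ||h^n||_A for n = n_terms, to be compared with ||h||_inf as in (5.5).
root = norms[n_terms] ** (1.0 / n_terms)
print(norm_A_h, sup_h, root, error)
```

Although the first power has $\mathcal{A}$-norm larger than 1, the norms
$\|h^n\|_{\mathcal{A}}$ eventually decay geometrically, and the $n$th roots approach
the grid estimate of $\|h\|_\infty$ from above, as (5.5) predicts.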

Remark 5.7. The proof above is certainly not the simplest proof and lacks the
elegance of Gel’fand theory. However, in view of the many variations and
generalizations of Wiener’s Lemma, it is important to understand which properties of
$\mathcal{A}(\mathbb{T})$ come into play. First, we have studied the problem on the
dense subspace of trigonometric polynomials; second, we have compared several norms,
namely the $L^2$-norm, the $L^\infty$-norm, and the $\mathcal{A}$-norm. The
$\mathcal{A}$-norm is rather tricky, because it is defined indirectly via the Fourier
coefficients. The comparison of $\|\cdot\|_{\mathcal{A}}$ with more accessible norms is
therefore natural. Last but not least, we used that $\mathcal{A}(\mathbb{T})$ is
commutative, when we applied the binomial theorem in Step 5.

Remark 5.8. The final estimate (5.4) of the proof leads to an estimate for the norm
of $1/f$ in $\mathcal{A}(\mathbb{T})$. The norm $\|1/f\|_{\mathcal{A}}$ depends on
$\delta$, on $\varepsilon$, and on $N$, which in turn is a function of $\varepsilon$.
It can be shown that there is no control of $\|1/f\|_{\mathcal{A}}$ in terms
of $\delta$ alone [195]. The problem of norm-controlled inversion in Banach algebras is
rather difficult, and in general little can be said. We refer to the beautiful work of
Nikolski [195] and an essay by Tao [222].

5.2.5 Abstract Concepts — Inverse-Closedness

Following Naimark, let us now take a very abstract look at Wiener’s Lemma.
Naimark [191, 192] turned Wiener’s Lemma into a definition. This is not a cheap
trick (to avoid a proof), but Naimark’s procedure conveys an important insight into
Wiener’s Lemma. Naimark understood that Wiener’s Lemma is a result about the
relationship between two Banach algebras, namely, the algebras
$\mathcal{A}(\mathbb{T})$ and $C(\mathbb{T})$. In particular, the condition
“$f(t) \neq 0$, $\forall t \in \mathbb{T}$” occurring in Theorem 5.5 simply means
that $f$ is invertible in $C(\mathbb{T})$. This observation justifies the following
definition.

Definition 5.9. Let $\mathcal{A} \subseteq \mathcal{B}$ be two Banach algebras with a
common identity. Then $\mathcal{A}$ is called inverse-closed in $\mathcal{B}$ if

$$a \in \mathcal{A} \ \text{ and } \ a^{-1} \in \mathcal{B} \quad \Longrightarrow \quad a^{-1} \in \mathcal{A}.$$

In other words, $\mathcal{A}$ is inverse-closed in $\mathcal{B}$ if an element $a$ in
the smaller algebra that is invertible in the bigger algebra is automatically invertible
in the smaller algebra. Or to put it differently, an element $a \in \mathcal{A}$ is
invertible in $\mathcal{A}$ if and only if $a$ is invertible in $\mathcal{B}$.

The inverse-closedness is often extremely useful for the study of invertibility.
The large algebra $\mathcal{B}$ contains more invertible elements; it may therefore be
easier to check the invertibility of an element. If we start with $a \in \mathcal{A}$,
then $a^{-1}$ is automatically in $\mathcal{A}$. The direct verification that $a$ is
invertible in $\mathcal{A}$ may be much more difficult.

In the light of Definition 5.9, Wiener’s Lemma states that the algebra of
absolutely convergent Fourier series $\mathcal{A}(\mathbb{T})$ is inverse-closed in the
algebra $C(\mathbb{T})$. For most mathematicians it is easier and more natural to see
that $f$ does not have any zeros on the interval $[0,1)$. The direct verification of
$1/f \in \mathcal{A}(\mathbb{T})$ would require finding the Fourier coefficients of
$1/f$ and checking whether they are absolutely summable.

Inverse-closedness is an important concept in many areas of mathematics where
Banach algebra arguments and spectra of operators play a role. Each area uses its
own terms, and so there is a Babylonian confusion in the terminology. We follow
Barnes [13]. Naimark uses the term Wiener pair $(\mathcal{A}, \mathcal{B})$ when
$\mathcal{A}$ is inverse-closed in $\mathcal{B}$. Palmer [196] says that $\mathcal{A}$
is a spectral subalgebra of $\mathcal{B}$. In $K$-theory one says that $\mathcal{A}$ is
a local subalgebra of $\mathcal{B}$ [22]; in the Russian literature (or rather its
translations into English) $\mathcal{A}$ is a full subalgebra of $\mathcal{B}$. Connes
[55] says that $\mathcal{A}$ is invariant under holomorphic calculus in $\mathcal{B}$,
and Arveson [8] calls $\mathcal{A}$ a spectrally invariant subalgebra of $\mathcal{B}$
and uses the term spectral permanence. Some of the terminology will become clearer when
we discuss the properties of inverse-closedness in more detail.


5.2.5.1 Spectral Invariance 


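Recall that the spectrum of an element $a$ in a Banach algebra $\mathcal{A}$ (with unit
$e$) is defined to be the set

$$\sigma_{\mathcal{A}}(a) = \{\lambda \in \mathbb{C} : a - \lambda e \text{ is not invertible in } \mathcal{A}\}.$$

The spectral radius of $a$ is

$$r_{\mathcal{A}}(a) = \max\{|\lambda| : \lambda \in \sigma_{\mathcal{A}}(a)\} = \lim_{n \to \infty} \|a^n\|_{\mathcal{A}}^{1/n}.$$

The last identity is the fundamental spectral radius formula. See [27, 56, 150, 209].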


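Lemma 5.10. Let $\mathcal{A} \subseteq \mathcal{B}$ be two Banach algebras with a
common unit $e$. Then the following statements are equivalent:

1. $\mathcal{A}$ is inverse-closed in $\mathcal{B}$.

2. $\sigma_{\mathcal{A}}(a) = \sigma_{\mathcal{B}}(a)$ for all $a \in \mathcal{A}$.


Proof. (1) $\Rightarrow$ (2): If $\lambda \notin \sigma_{\mathcal{A}}(a)$, then
$(a - \lambda e)^{-1} \in \mathcal{A} \subseteq \mathcal{B}$, so
$\lambda \notin \sigma_{\mathcal{B}}(a)$. This means that
$\sigma_{\mathcal{B}}(a) \subseteq \sigma_{\mathcal{A}}(a)$.
This argument shows that the inclusion $\mathcal{A} \subseteq \mathcal{B}$ always
implies that

$$\sigma_{\mathcal{B}}(a) \subseteq \sigma_{\mathcal{A}}(a).$$

Conversely, if $a \in \mathcal{A}$ and $\lambda \notin \sigma_{\mathcal{B}}(a)$, then
$(a - \lambda e)^{-1} \in \mathcal{B}$. Since $\mathcal{A}$ is inverse-closed in
$\mathcal{B}$, $(a - \lambda e)^{-1}$ must also be in $\mathcal{A}$, and so
$\lambda \notin \sigma_{\mathcal{A}}(a)$ and thus

$$\sigma_{\mathcal{A}}(a) \subseteq \sigma_{\mathcal{B}}(a).$$

(2) $\Rightarrow$ (1): $a \in \mathcal{A}$, $a^{-1} \in \mathcal{B}$ means
$0 \notin \sigma_{\mathcal{B}}(a)$, so $0 \notin \sigma_{\mathcal{A}}(a)$ and
$a^{-1} \in \mathcal{A}$. Thus, $\mathcal{A}$ is inverse-closed in $\mathcal{B}$. □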

Lemma 5.10 explains why the term spectral invariance is often used in 
connection with an inverse-closed subalgebra. 

In the light of Lemma 5.10, Wiener’s Lemma states that

$$\sigma_{\mathcal{A}(\mathbb{T})}(f) = \sigma_{C(\mathbb{T})}(f) = f(\mathbb{T}). \tag{5.6}$$

In general, it is very difficult to verify when an algebra $\mathcal{A}$ is
inverse-closed in $\mathcal{B}$. Hulanicki’s lemma [143] yields an important criterion
for, and offers a strategy to prove, inverse-closedness. In this regard Hulanicki’s
result lies somewhat deeper and requires some additional property of the larger algebra
$\mathcal{B}$.

Recall that an involution $a \mapsto a^*$ of an algebra $\mathcal{A}$ is a mapping that
satisfies the following properties: (a)
$(\lambda a + \mu b)^* = \bar{\lambda} a^* + \bar{\mu} b^*$ for all
$a, b \in \mathcal{A}$ and $\lambda, \mu \in \mathbb{C}$; (b) $(a^*)^* = a$ for all
$a \in \mathcal{A}$; and (c) $(ab)^* = b^* a^*$ for all $a, b \in \mathcal{A}$. A Banach
algebra with a continuous involution is called a Banach $*$-algebra. A Banach
$*$-algebra $\mathcal{A}$ is called symmetric if
$\sigma_{\mathcal{A}}(a^*a) \subseteq [0,\infty)$ for all $a \in \mathcal{A}$; i.e., the
spectrum of positive elements is positive.

Proposition 5.11 (Hulanicki’s lemma). Assume that $\mathcal{A} \subseteq \mathcal{B}$
are two Banach $*$-algebras with a common unit element and common involution. Assume
that $\mathcal{B}$ is symmetric. Then the following are equivalent:

1. $\mathcal{A}$ is inverse-closed in $\mathcal{B}$.

2. $r_{\mathcal{A}}(a) = r_{\mathcal{B}}(a)$ for all $a = a^* \in \mathcal{A}$.

3. $r_{\mathcal{A}}(a) \le r_{\mathcal{B}}(a)$ for all $a = a^* \in \mathcal{A}$.

If one of these conditions is satisfied, then $\mathcal{A}$ is also symmetric.

Thus, instead of verifying the spectral identity of Lemma 5.10, it suffices to verify 
the equality of two spectral radii. The spectral radius is an analytic concept (whereas 
the spectrum is an algebraic notion), and the equality of spectral radii in condition 
(2) can be attacked with analytic methods. In this way Hulanicki’s lemma offers a 
strategy to verify inverse-closedness. 

Proof. The implication (1) $\Rightarrow$ (2) follows from Lemma 5.10, and the
implication (2) $\Rightarrow$ (3) is obvious. The implication (3) $\Rightarrow$ (2)
follows from the inclusion $\sigma_{\mathcal{B}}(a) \subseteq \sigma_{\mathcal{A}}(a)$
and the ensuing inequality $r_{\mathcal{B}}(a) \le r_{\mathcal{A}}(a)$, which always
hold when $\mathcal{A} \subseteq \mathcal{B}$ (see the proof of Lemma 5.10).

The heart of Hulanicki’s lemma is the nontrivial implication (2) $\Rightarrow$ (1). The
idea is that the identity of spectral radii $r_{\mathcal{A}}(a) = r_{\mathcal{B}}(a)$
implies that power series have the same radius of convergence in $\mathcal{A}$ and in
$\mathcal{B}$.

We first treat a special case: For $b = b^* \in \mathcal{A}$ consider the geometric
series $c = \sum_{k=0}^{\infty} (e - b)^k$. If $r_{\mathcal{B}}(e - b) < 1$, then this
series converges in $\mathcal{B}$. In this case the sum is $c = b^{-1}$. Since $e - b$
is self-adjoint, condition (2) yields
$r_{\mathcal{A}}(e - b) = r_{\mathcal{B}}(e - b) < 1$; hence, this series also converges
in $\mathcal{A}$. Consequently, its sum $c = b^{-1}$ belongs to the smaller algebra
$\mathcal{A}$.

To treat the general case, assume that $a \in \mathcal{A}$ is invertible in
$\mathcal{B}$, i.e., $a^{-1} \in \mathcal{B}$. Consider the element
$b = (2\|a^*a\|_{\mathcal{B}})^{-1} a^*a \in \mathcal{A} \subseteq \mathcal{B}$. Then
$b$ is invertible in $\mathcal{B}$, and $\|b\|_{\mathcal{B}} = 1/2$. Since $\mathcal{B}$
is assumed to be symmetric, the spectrum of $b$ is contained in $[0, \infty)$. The
invertibility of $b$ implies that 0 is not in the spectrum, and the norm bound implies
that the spectrum is contained in a disc of radius 1/2. Since the spectrum is a compact
set, there exists a $\delta > 0$ such that

$$\sigma_{\mathcal{B}}(b) \subseteq [\delta, \tfrac{1}{2}].$$

Consequently, $\sigma_{\mathcal{B}}(e - b) \subseteq [1/2, 1 - \delta]$ and

$$r_{\mathcal{B}}(e - b) \le 1 - \delta < 1.$$

This is the situation of the special case above, and we may conclude that
$b^{-1} = \sum_{n=0}^{\infty} (e - b)^n$ converges simultaneously in $\mathcal{B}$ and
in $\mathcal{A}$, whence $b^{-1} \in \mathcal{A}$.

Now $e = b^{-1} b = \big((2\|a^*a\|_{\mathcal{B}})^{-1} b^{-1} a^*\big) a$ and thus $a$
possesses the left inverse $c = (2\|a^*a\|_{\mathcal{B}})^{-1} b^{-1} a^*$. By applying
the same argument to
$b = (2\|aa^*\|_{\mathcal{B}})^{-1} aa^* \in \mathcal{A} \subseteq \mathcal{B}$, we
obtain a right inverse of $a$ of the form
$(2\|aa^*\|_{\mathcal{B}})^{-1} a^* b^{-1} \in \mathcal{A}$. Thus, $a$ is invertible in
$\mathcal{A}$.

Finally, since $\mathcal{B}$ is symmetric, we know that
$\sigma_{\mathcal{B}}(a^*a) \subseteq [0,\infty)$ for all $a \in \mathcal{A}$. Since
$\mathcal{A}$ is inverse-closed in $\mathcal{B}$,
$\sigma_{\mathcal{A}}(a^*a) = \sigma_{\mathcal{B}}(a^*a) \subseteq [0,\infty)$ for all
$a \in \mathcal{A}$, and so $\mathcal{A}$ must be symmetric. □
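
The construction in the general case can be tried out in the simplest possible setting,
the algebra of $3 \times 3$ matrices with the operator norm. The Python sketch below is
only an illustration under that assumption (the matrix `a` is a hypothetical example,
and in finite dimensions inverse-closedness is of course not an issue): it forms
$b = (2\|a^*a\|)^{-1} a^*a$, sums the geometric series for $b^{-1}$, and recovers a left
inverse of $a$ as in the proof.

```python
import numpy as np

# Hypothetical invertible matrix (2 * identity plus a cyclic shift).
a = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [1.0, 0.0, 2.0]])
identity = np.eye(3)

# b = a* a / (2 ||a* a||), where ||.|| is the operator (spectral) norm.
scale = 2.0 * np.linalg.norm(a.conj().T @ a, 2)
b = (a.conj().T @ a) / scale

# The spectrum of b lies in [delta, 1/2], so sum_n (e - b)^n converges to b^{-1}.
spectrum_b = np.linalg.eigvalsh(b)
b_inverse = np.zeros_like(b)
term = identity.copy()
for _ in range(200):
    b_inverse += term
    term = term @ (identity - b)

# Left inverse c = (2 ||a* a||)^{-1} b^{-1} a*, as in the proof.
left_inverse = (b_inverse @ a.conj().T) / scale
print(spectrum_b, np.abs(left_inverse @ a - identity).max())
```

The rate of convergence of the series is governed by $1 - \delta$, where $\delta$ is the
smallest eigenvalue of $b$; for badly conditioned $a$ many more terms would be needed.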

Note that the structure of the proof of Hulanicki’s lemma is identical to that of the 
proof of Wiener’s Lemma (Theorem 5.5). In both cases we first restricted to positive 
elements for which we used geometric series to investigate their invertibility. 

Since the symmetry of a Banach algebra with involution is usually difficult to 
verify (and still a topic of many unsolved problems), Hulanicki’s lemma is usually 
applied in the following form. 

Lemma 5.12. Let $\mathcal{A}$ be an involutive Banach algebra with identity $e$. Suppose
that there exists a one-to-one $*$-homomorphism $\pi$ from $\mathcal{A}$ into
$\mathcal{B}(\mathcal{H})$, the $C^*$-algebra of bounded operators on a Hilbert space
$\mathcal{H}$, such that $\pi(e) = \mathrm{Id}_{\mathcal{H}}$. If

$$r_{\mathcal{A}}(a) = \|\pi(a)\|_{\mathrm{op}} \quad \text{for all } a = a^* \in \mathcal{A}, \tag{5.7}$$

then

$$\sigma_{\mathcal{A}}(a) = \sigma_{\mathcal{B}(\mathcal{H})}(\pi(a)) \quad \text{for all } a \in \mathcal{A}. \tag{5.8}$$

In particular, $\mathcal{A}$ is symmetric.

Proof. Since $\pi$ is assumed to be faithful (i.e., one-to-one), we may identify
$\mathcal{A}$ with a $*$-subalgebra of $\mathcal{B} = \mathcal{B}(\mathcal{H})$. With
this identification and Proposition 5.11, (5.7) implies the spectral identity (5.8).
Furthermore, $\mathcal{B}(\mathcal{H})$ is symmetric, because we know from functional
analysis that the spectrum of operators of the form $T^*T$ is positive. Consequently,
$\sigma_{\mathcal{A}}(a^*a) = \sigma_{\mathcal{B}(\mathcal{H})}(\pi(a)^*\pi(a)) \subseteq [0,\infty)$,
and $\mathcal{A}$ is symmetric. □

Remark 5.13. In the jargon $\pi$ is called a faithful $*$-representation of
$\mathcal{A}$ by bounded operators on the Hilbert space $\mathcal{H}$. To prove the
symmetry of a Banach $*$-algebra, one often constructs a faithful representation $\pi$
on a Hilbert space and then tries to establish the identity of spectral radii (5.7).
Lemma 5.12 thus gives us a glimpse of the important relationship between symmetry and
representation theory.

At first glance, symmetry is a property of a single algebra, whereas
inverse-closedness is a relationship between two algebras. Nevertheless the two
concepts are closely related. Let us digress for a moment and describe their connection.
In the abstract theory of Banach $*$-algebras one can assign a $C^*$-algebra to every
Banach $*$-algebra. Consider the set $\mathcal{S}$ of all $C^*$-seminorms on
$\mathcal{A}$; i.e., $\mathcal{S}$ contains all seminorms $p$ on $\mathcal{A}$
satisfying the $C^*$-condition $p(a^*a) = p(a)^2$ for all $a \in \mathcal{A}$.
Note that $\mathcal{A}$ is usually not complete with respect to such a seminorm. Now
define the maximal $C^*$-seminorm on $\mathcal{A}$, the so-called Gel’fand seminorm, by

$$\gamma_{\mathcal{A}}(a) = \sup\{p(a) : p \in \mathcal{S}\}.$$

The completion of the quotient
$\mathcal{A}/\{a \in \mathcal{A} : \gamma_{\mathcal{A}}(a) = 0\}$ with respect to the
maximal $C^*$-seminorm $\gamma_{\mathcal{A}}$ is a $C^*$-algebra and is called the
enveloping $C^*$-algebra of $\mathcal{A}$, denoted by $C^*(\mathcal{A})$. Now we can
formulate the relationship between symmetry and inverse-closedness.

Proposition 5.14. Assume that $\gamma_{\mathcal{A}}$ is a norm on $\mathcal{A}$. Then
$\mathcal{A}$ is symmetric if and only if $\mathcal{A}$ is inverse-closed in the
enveloping $C^*$-algebra $C^*(\mathcal{A})$.

For a proof see [196, 11.4]. One implication is easy. If $\mathcal{A}$ is inverse-closed
in $C^*(\mathcal{A})$, then Lemma 5.10 implies that
$\sigma_{\mathcal{A}}(a^*a) = \sigma_{C^*(\mathcal{A})}(a^*a)$ for all
$a \in \mathcal{A}$. Since every $C^*$-algebra is symmetric [27, 196], the spectrum of
positive elements $a^*a$ is contained in $[0,\infty)$. This means that $\mathcal{A}$ is
symmetric.

Functional Calculus. Recall how the Riesz functional calculus (or holomorphic
functional calculus) works. Fix an element $a \in \mathcal{A}$ with spectrum
$\sigma_{\mathcal{A}}(a)$. Let $f$ be an analytic function on an open neighborhood $O$
of $\sigma_{\mathcal{A}}(a)$ and let $\gamma \subseteq O$ be a contour of
$\sigma_{\mathcal{A}}(a)$; i.e., $\gamma$ is a rectifiable curve and points in
$\sigma_{\mathcal{A}}(a)$ have winding number 1, and points in the complement of $O$
have winding number 0. Define the Banach-algebra-valued path integral

$$f(a) = \frac{1}{2\pi i} \int_\gamma f(z)\,(ze - a)^{-1}\,dz. \tag{5.9}$$

Here the integral can be understood as a Riemann integral (limit of Riemann sums);
in particular, it is also defined weakly. If $a^* \in \mathcal{A}^*$ is in the dual
space of $\mathcal{A}$, then

$$\langle a^*, f(a) \rangle = \frac{1}{2\pi i} \int_\gamma f(z)\,\langle a^*, (ze - a)^{-1} \rangle\,dz.$$

The mapping $f \mapsto f(a)$ is an algebra homomorphism from the commutative
algebra of functions analytic on some neighborhood of $\sigma_{\mathcal{A}}(a)$ into a
commutative subalgebra of $\mathcal{A}$. For a detailed exposition of the functional
calculus, see [56, 209].

Usually the functional calculus depends sensitively on the underlying algebra. 
We therefore note an immediate, but important, consequence of the spectral 
invariance. 

Corollary 5.15. Assume that $\mathcal{A}$ is continuously embedded in $\mathcal{B}$ and
inverse-closed in $\mathcal{B}$. Then the Riesz functional calculi for $\mathcal{A}$ and
$\mathcal{B}$ coincide.

Proof. Since $\sigma_{\mathcal{A}}(a) = \sigma_{\mathcal{B}}(a)$ by Lemma 5.10, the set
of analytic functions $f$ for which $f(a)$ is well defined by (5.9) is the same for
$\mathcal{A}$ as for $\mathcal{B}$. Thus, the path integral in (5.9) defines an element
in both $\mathcal{A}$ and $\mathcal{B}$. Since $\mathcal{A}$ is continuously embedded in
$\mathcal{B}$, the limit of Riemann sums is the same in $\mathcal{A}$ as in
$\mathcal{B}$. Thus, $f(a)$ is defined unambiguously and does not depend on the
algebra. □

Corollary 5.15 is extremely useful in situations when the existence of $f(a)$
is known by other means. The most common situation is when $\mathcal{B}$ is
$\mathcal{B}(\mathcal{H})$, the algebra of bounded operators on a Hilbert space. In this
case we have the continuous functional calculus at our disposal and know how to
establish the existence of (square) roots, absolute values, and other functions of
positive operators. Assume that $\mathcal{A}$ is inverse-closed in
$\mathcal{B}(\mathcal{H})$ and that $a$ is a positive invertible element in
$\mathcal{A}$, i.e., $a = b^*b$ for some $b \in \mathcal{A}$. Then $f(z) = z^\sigma$ is
analytic on a neighborhood of $\sigma_{\mathcal{A}}(a) \subseteq [\alpha, \beta]$,
$\alpha > 0$, and $a^\sigma$ makes sense in $\mathcal{B}$ for arbitrary
$\sigma \in \mathbb{R}$. By Corollary 5.15, $a^\sigma \in \mathcal{A}$ as well.

As another consequence we state an early result from the theory of absolutely
convergent series, which is known as the theorem of Wiener-Lévy [165, 248].

Theorem 5.16. Assume that $h \in \mathcal{A}(\mathbb{T})$ and that $f$ is holomorphic on
an open set containing the image $h(\mathbb{T})$. Then
$f \circ h \in \mathcal{A}(\mathbb{T})$.

Proof. We use the Riesz functional calculus to compute $f(h)$. Choose a contour $\gamma$
of $\sigma_{\mathcal{A}(\mathbb{T})}(h) = h(\mathbb{T})$ [by (5.6)] and let
$\delta_t \in \mathcal{A}(\mathbb{T})^*$ be the point evaluation $\delta_t(h) = h(t)$.
Then with the weak definition of the Banach-algebra-valued integral we obtain

$$
f(h)(t) = \langle \delta_t, f(h) \rangle
= \frac{1}{2\pi i} \int_\gamma f(z)\,\langle \delta_t, (z - h)^{-1} \rangle\,dz
= \frac{1}{2\pi i} \int_\gamma \frac{f(z)}{z - h(t)}\,dz = f(h(t)),
$$

where in the last step we used Cauchy’s integral formula. By the properties of the
functional calculus, $f(h) \in \mathcal{A}(\mathbb{T})$, and by the above computation,
$f(h) = f \circ h$. Hence, $f \circ h$ possesses an absolutely convergent Fourier
series. □
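
A quick numerical illustration of Theorem 5.16 is possible with the FFT. The Python
sketch below makes several assumptions that are not in the text: it takes the
hypothetical example $h(t) = \cos 2\pi t$ and $f = \exp$, samples $f \circ h$ at `M`
equispaced points, and uses the discrete Fourier transform as an approximation of the
Fourier coefficients (indices are only determined modulo `M`).

```python
import numpy as np

M = 64                                # number of sample points (an assumption)
t = np.arange(M) / M
h_values = np.cos(2 * np.pi * t)      # h has coefficients 1/2 at k = +1 and k = -1
composed = np.exp(h_values)           # samples of f o h with f = exp

# Approximate Fourier coefficients of f o h; index k is understood modulo M.
coeffs = np.fft.fft(composed) / M
norm_A = np.abs(coeffs).sum()         # approximation of ||f o h||_A
print(norm_A, np.abs(coeffs[:6]))
```

In this example the coefficients decay very rapidly, and the approximate
$\mathcal{A}$-norm comes out close to $e = \exp(h(0))$, which is consistent with all
coefficients of $\exp(\cos 2\pi t)$ being nonnegative.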

Note that if we take $f(z) = 1/z$, we recover the classical formulation of Wiener’s
Lemma.


5.2.6 Convolution Operators


Wiener’s Lemma can be reformulated as a statement about the spectrum of convo¬ 
lution operators. This formulation is useful for concrete problems in signal analysis. 

As a motivation let us introduce some engineering terminology. In engineering 
a system is a black box that transforms an input signal a into an output signal b. 
Usually this transformation is assumed to be linear, so for a mathematician a system 
is just a linear operator A. In other applications a is a signal that is transmitted 
by a sender and b is the received signal. In this case, the operator A describes the 
distortion of a. In this context one speaks of a channel rather than of a system. 

The goal is to understand the properties of the “channel” and the input-output 
relationship. A specific goal in signal processing is to calculate or estimate the input 
a from a measured output b. This process is called equalization and amounts to 
solving the equation Aa = b for a or to inverting A. 

The simplest systems are discrete time-invariant systems corresponding to a
black box with constant characteristics or to a stationary transmission environment.
As before we denote sequences with boldface letters $\mathbf{a}, \mathbf{b}$, etc., and
their entries with $a(k), b(k)$ or $a_k, b_k$, $k \in \mathbb{Z}$. Let $T_r$ denote the
translation operator; it acts on a sequence $\mathbf{a}$ by
$(T_r \mathbf{a})(k) = a(k - r)$ for $k, r \in \mathbb{Z}$. Time invariance means that
if the input $\mathbf{a}$ results in the output $\mathbf{b}$, then the shifted input
$T_r \mathbf{a}$ results in the output $T_r \mathbf{b}$. For a linear system we then have

$$A T_r \mathbf{a} = T_r A \mathbf{a}, \quad \forall r \in \mathbb{Z}, \tag{5.10}$$

for “all” sequences $\mathbf{a}$. Mathematically, a time-invariant system is therefore
an operator that commutes with translations.

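Let $\delta_k$, $k \in \mathbb{Z}$, be the standard basis of $\ell^2(\mathbb{Z})$
defined by $\delta_k(l) = 1$, if $l = k$, and $\delta_k(l) = 0$, if $l \neq k$. Then
$\delta_k = T_k \delta_0$ and every sequence on $\mathbb{Z}$ can formally be written as
$\mathbf{a} = \sum_{k \in \mathbb{Z}} a(k)\,\delta_k = \sum_{k \in \mathbb{Z}} a(k)\,T_k \delta_0$.
Then by (5.10) we find that

$$
\begin{aligned}
(A\mathbf{a})(l) &= A\Big(\sum_k a(k)\,T_k \delta_0\Big)(l) \\
&= \sum_k a(k)\,T_k (A\delta_0)(l) \\
&= \sum_k a(k)\,(A\delta_0)(l - k) \\
&= (\mathbf{a} * (A\delta_0))(l).
\end{aligned}
$$

Thus, the time-invariant system $A$ is the convolution with the sequence
$\mathbf{h} := A\delta_0$. This sequence is the response of the system to the “pulse”
$\delta_0$ and therefore is called the impulse response.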

We denote the convolution operator $C_{\mathbf{h}} \mathbf{a} = \mathbf{a} * \mathbf{h}$
and call $\mathbf{h}$ the symbol of $C_{\mathbf{h}}$. Our informal argument shows that
every time-invariant linear system is uniquely defined by its impulse response
$\mathbf{h}$, and conversely that every sequence $\mathbf{h}$ defines a unique
time-invariant system $A = C_{\mathbf{h}}$.

For physical reasons we may assume that signals of finite energy (i.e., finite
$\ell^2$-norm) are mapped to signals of finite energy; in other words, $A$ is bounded on
$\ell^2$. Usually the symbol $\mathbf{h}$ is a finite sequence (a system with “finite
impulse response”), so it is safe to assume that $\mathbf{h}$ is in
$\ell^1(\mathbb{Z})$. Under this assumption our deduction is rigorous, because all sums
converge absolutely.
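
The passage from a time-invariant system to its impulse response can be made concrete
with a small Python sketch. The function `system` below is a hypothetical filter chosen
for illustration only (it is not taken from the text), and signals are assumed to be
finitely supported on the nonnegative integers.

```python
import numpy as np


def system(a):
    """Hypothetical time-invariant system: (Aa)(k) = a(k) + 0.5 a(k-1) - 0.25 a(k-2).

    The input is supported on 0..n-1; the output is supported on 0..n+1.
    """
    padded = np.concatenate([a, np.zeros(2)])
    out = padded.copy()
    out[1:] += 0.5 * padded[:-1]
    out[2:] -= 0.25 * padded[:-2]
    return out


# Impulse response h = A(delta_0).
delta0 = np.array([1.0])
h = system(delta0)

# Applying the system directly agrees with convolving the input with h.
a = np.array([3.0, -1.0, 4.0, 1.0, -5.0])
direct = system(a)
via_convolution = np.convolve(a, h)

# Time invariance: shifting the input by one step shifts the output by one step.
shifted_output = system(np.concatenate([[0.0], a]))
print(h, np.allclose(direct, via_convolution))
```

Here `h` plays the role of $A\delta_0$, and the agreement of `direct` with
`via_convolution` is the finite-dimensional counterpart of $A = C_{\mathbf{h}}$.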

In particular, we note the following boundedness property. 

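Lemma 5.17 (Young’s inequality).

1. If $\mathbf{h} \in \ell^1(\mathbb{Z})$ and $\mathbf{a} \in \ell^p(\mathbb{Z})$, then
$\mathbf{h} * \mathbf{a} \in \ell^p(\mathbb{Z})$ and
$\|\mathbf{h} * \mathbf{a}\|_p \le \|\mathbf{h}\|_1\,\|\mathbf{a}\|_p$.

2. Thus, if $\mathbf{h} \in \ell^1(\mathbb{Z})$, then $C_{\mathbf{h}}$ is bounded on
$\ell^p$ for $1 \le p \le \infty$. The operator norm is bounded uniformly by

$$\|C_{\mathbf{h}}\|_{\ell^p \to \ell^p} \le \|\mathbf{h}\|_1.$$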
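
Young’s inequality is easy to test numerically on finitely supported sequences. The
following Python sketch uses random data and a few sample values of $p$; both are
arbitrary choices made for illustration and are not part of the lemma.

```python
import numpy as np

rng = np.random.default_rng(2)
h = rng.standard_normal(8)     # hypothetical symbol in l^1 (finitely supported)
a = rng.standard_normal(50)    # hypothetical input signal
conv = np.convolve(h, a)       # h * a

# Ratio ||h * a||_p / (||h||_1 ||a||_p) for several exponents p.
ratios = {}
for p in (1, 2, 3, np.inf):
    ratios[p] = np.linalg.norm(conv, p) / (np.linalg.norm(h, 1) * np.linalg.norm(a, p))
print(ratios)
```

Every ratio should be at most 1, in agreement with part 1 of the lemma.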

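By Young’s inequality a convolution operator with $\ell^1$-symbol is bounded
simultaneously on all $\ell^p$-spaces. So for $\mathbf{h} \in \ell^1$, $C_{\mathbf{h}}$
is an element of the Banach algebra $\mathcal{B}(\ell^p)$, the Banach algebra of bounded
operators on $\ell^p(\mathbb{Z})$. Let

$$\sigma_{\mathcal{B}(\ell^p)}(C_{\mathbf{h}}) = \{\lambda \in \mathbb{C} : C_{\mathbf{h}} - \lambda\,\mathrm{Id}_{\ell^p} \text{ is not invertible on } \ell^p(\mathbb{Z})\}$$

be the spectrum of $C_{\mathbf{h}}$ as an operator acting on $\ell^p(\mathbb{Z})$.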

An immediate question is how the spectrum of Ch depends on the domain space 
£ P (Z). This question is important for signal analysis and interesting in its own right. 
In signal analysis one would like to deduce properties of the input a from the output 
b = Cha. In particular, if b G £ P (Z), can we assert that a is in the same £ P (Z )? 

The mathematical analysis of this problem brings good news for the engineer: 
The spectrum is independent of £ P (Z), and in particular the invertibility of a time- 
invariant system is independent of the domain space £ P (Z). 

We now come to the main point: Wiener’s Lemma is the main tool to understand 
and compute the spectrum of convolution operators. We first give a formulation 
of Wiener’s Lemma for convolution operators that is equivalent to the classical 
Wiener’s Lemma. 

Theorem 5.18 (Wiener’s Lemma for convolution operators). If $h \in \ell^1(\mathbb{Z})$ and 
$C_h$ is invertible on $\ell^2(\mathbb{Z})$, then the inverse operator is again a convolution operator 
$C_h^{-1} = C_g$ with a symbol $g \in \ell^1(\mathbb{Z})$. Consequently, $C_h$ is invertible simultaneously 
on all $\ell^p(\mathbb{Z})$, $1 \le p \le \infty$. 

Proof. As a preparation, consider the Fourier series $\hat h(t) = \sum_{k \in \mathbb{Z}} h_k e^{2\pi i k t}$ of the 
sequence $h$. Since $\{e^{2\pi i k t} : k \in \mathbb{Z}\}$ is an orthonormal basis for $L^2(\mathbb{T})$, the mapping 
$h \mapsto \hat h$ is a unitary operator from $\ell^2(\mathbb{Z})$ onto $L^2(\mathbb{T})$. Consequently, if $h \in \ell^2(\mathbb{Z})$, then 
the Fourier series $\hat h$ converges in $L^2(\mathbb{T})$ and $\hat h$ is defined almost everywhere. If, in 
addition, $h \in \ell^1(\mathbb{Z})$, then the Fourier series $\hat h$ converges absolutely and $\hat h \in \mathcal{A}(\mathbb{T})$. 

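Next, by reading (5.1) backwards, we know that the Fourier series of $h * g$, $h, g \in 
\ell^1(\mathbb{Z})$, is just the pointwise product 

$$\widehat{h * g} = \hat h\, \hat g. \tag{5.11}$$

If $h \in \ell^1(\mathbb{Z})$ and $g \in \ell^2(\mathbb{Z})$, then this formula holds almost everywhere and is an 
identity of two $L^2$-functions. 

Consequently, the Fourier series of $C_h a$ is 

$$\widehat{C_h a} = \widehat{h * a} = \hat h\, \hat a. \tag{5.12}$$

Thus, on the Fourier side, $C_h$ becomes the multiplication operator $M_{\hat h}\, \hat a = \hat h\, \hat a$. More 
precisely, writing $\mathcal{F}$ for the Fourier transform $a \mapsto \hat a$, we may express the identity (5.12) by means of a commutative diagram: 

$$
\begin{array}{ccc}
\ell^2(\mathbb{Z}) & \xrightarrow{\;C_h\;} & \ell^2(\mathbb{Z}) \\
\downarrow \mathcal{F} & & \downarrow \mathcal{F} \\
L^2(\mathbb{T}) & \xrightarrow{\;M_{\hat h}\;} & L^2(\mathbb{T}).
\end{array} \tag{5.13}
$$

Since $\mathcal{F}$ is unitary, $C_h$ has the same spectrum as $M_{\hat h}$: 

$$\sigma_{\mathcal{B}(L^2)}(M_{\hat h}) = \sigma_{\mathcal{B}(\ell^2)}(C_h) = \operatorname{ran} \hat h. \tag{5.14}$$

In particular, $C_h$ is invertible on $\ell^2(\mathbb{Z})$ if and only if $M_{\hat h}$ is invertible on $L^2(\mathbb{T})$. This 
is the case if and only if $|\hat h(t)| \ge \delta > 0$ for almost all $t \in \mathbb{T}$. 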

Here Wiener’s Lemma makes its decisive appearance: Since $C_h$ is assumed to be 
invertible and $h \in \ell^1(\mathbb{Z})$, the function $\hat h$ is continuous and bounded away from zero almost everywhere, so $\hat h(t) \neq 0$ for all $t \in \mathbb{T}$. Theorem 5.5 asserts that 
$1/\hat h$ possesses an absolutely convergent Fourier series. This means that there exists 
a $g \in \ell^1(\mathbb{Z})$ such that $\hat g = 1/\hat h$. By (5.11), the equation $\hat g\, \hat h = 1$ implies that 

$$g * h = h * g = \delta_0.$$

Consequently, for $a \in \ell^2(\mathbb{Z})$ we find, using (5.11) repeatedly, 

$$a = \delta_0 * a = (h * g) * a = h * (g * a) = C_h C_g a.$$

Thus, $C_h C_g = \mathrm{Id}_{\ell^2}$ and likewise $C_g C_h = \mathrm{Id}_{\ell^2}$. So $C_h^{-1} = C_g$ with $g \in \ell^1(\mathbb{Z})$ as claimed. 
Since both $C_h$ and $C_g$ are bounded on every $\ell^p(\mathbb{Z})$, $1 \le p \le \infty$, the convolution 
operator $C_h$ is also invertible on $\ell^p(\mathbb{Z})$ with the inverse $C_g$. □ 
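
The construction of the inverse symbol $g$ with $\hat g = 1/\hat h$ can be imitated numerically. The following sketch is an illustration under stated assumptions, not the method of the text: it approximates the Fourier coefficients of $1/\hat h$ by an equispaced Riemann sum, for the hypothetical symbol $h = \delta_0 - \tfrac12 \delta_1$, whose Fourier series $1 - \tfrac12 e^{2\pi i t}$ has no zeros. In this case the geometric series gives $g_k = 2^{-k}$ for $k \ge 0$ and $g_k = 0$ for $k < 0$.

```python
import cmath
from typing import Dict

Seq = Dict[int, complex]


def fourier_series(h: Seq, t: float) -> complex:
    """Evaluate h^(t) = sum_k h_k exp(2 pi i k t) for finitely supported h."""
    return sum(c * cmath.exp(2j * cmath.pi * k * t) for k, c in h.items())


def inverse_symbol(h: Seq, kmax: int, n: int = 256) -> Seq:
    """Approximate the Fourier coefficients g_k of 1/h^ for |k| <= kmax.

    Uses an n-point Riemann sum for the integral of exp(-2 pi i k t) / h^(t)
    over [0, 1]; assumes that h^ has no zero on the torus.
    """
    samples = [1.0 / fourier_series(h, j / n) for j in range(n)]
    return {
        k: sum(s * cmath.exp(-2j * cmath.pi * k * j / n)
               for j, s in enumerate(samples)) / n
        for k in range(-kmax, kmax + 1)
    }


def convolve(a: Seq, b: Seq) -> Seq:
    """Convolution of two finitely supported sequences."""
    out: Seq = {}
    for i, ai in a.items():
        for j, bj in b.items():
            out[i + j] = out.get(i + j, 0) + ai * bj
    return out


h: Seq = {0: 1.0, 1: -0.5}      # h^(t) = 1 - 0.5 exp(2 pi i t), never zero
g = inverse_symbol(h, kmax=20)  # approximates g_k = 0.5**k for k >= 0
hg = convolve(h, g)             # close to delta_0 on the indices -20, ..., 20
```

The entry of `hg` at index 21 is a truncation artifact of cutting $g$ off at `kmax`; it is not an error of the method and shrinks as `kmax` grows.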

Spectral Invariance of Convolution Operators. Finally, let us draw some consequences 
of Wiener’s Lemma for convolution operators. 

Theorem 5.19. Assume that $h \in \ell^1(\mathbb{Z})$. Then the following statements are 
equivalent. 

1. $C_h$ is invertible on $\ell^2(\mathbb{Z})$. 

2. $C_h$ is invertible on $\ell^p(\mathbb{Z})$ for all $p \in [1, \infty]$. 

3. $C_h$ is invertible on $\ell^p(\mathbb{Z})$ for some $p \in [1, \infty]$. 

Proof. The implication (1) $\Rightarrow$ (2) is Wiener’s Lemma for convolution operators 
(Theorem 5.18) and the implication (2) $\Rightarrow$ (3) is trivial. 

For the implication (3) $\Rightarrow$ (1) we use a duality argument and interpolation. First 
let us prepare the argument. Let $a \mapsto a^*$ be the usual involution $a^*(k) = \overline{a(-k)}$ 
extended to $\ell^p(\mathbb{Z})$. This involution is a bijective conjugate-linear isometry on $\ell^p(\mathbb{Z})$ 
and satisfies the relation $(a * b)^* = b^* * a^* = a^* * b^*$ whenever the convolution is 
defined, in particular, when $a \in \ell^1(\mathbb{Z})$ and $b \in \ell^p(\mathbb{Z})$. Next, using the inner product 
$\langle a, b \rangle = \sum_{k \in \mathbb{Z}} a_k \overline{b_k}$ for the duality between $\ell^p(\mathbb{Z})$ and $\ell^{p'}(\mathbb{Z})$, $1/p + 1/p' = 1$, it is 
easily checked that 

$$\langle C_h a, b \rangle = \langle a, C_{h^*} b \rangle \qquad \text{for } h \in \ell^1,\ a \in \ell^p,\ b \in \ell^{p'}.$$

Thus, the adjoint operator of a convolution operator with respect to $\langle \cdot, \cdot \rangle$ is 
$(C_h)^* = C_{h^*}$. 

Claim: For $h \in \ell^1(\mathbb{Z})$ and fixed $p \in [1, \infty]$, the convolution operator $C_h$ is invertible 
on $\ell^p(\mathbb{Z})$ if and only if $C_{h^*}$ is invertible on $\ell^p(\mathbb{Z})$. 

Since $C_{h^*} a = h^* * a = (h * a^*)^* = (C_h a^*)^*$ and $*$ is a bijection on $\ell^p(\mathbb{Z})$, $C_{h^*}$ is 
one-to-one if and only if $C_h$ is one-to-one, and likewise $C_{h^*}$ is onto if and only if $C_h$ 
is onto. 

Now assume that $C_h$ is invertible on $\ell^p(\mathbb{Z})$. By the claim, $C_{h^*}$ is also invertible 
on $\ell^p(\mathbb{Z})$ and thus by a general principle [56, Prop. VI.1.4] its adjoint $(C_{h^*})^* = C_h$ 
is invertible on the dual space $\ell^{p'}(\mathbb{Z})$. Let $M$ be the inverse of $C_h$ on $\ell^{\max(p,p')}(\mathbb{Z})$; 
then clearly $M$ is also the inverse of $C_h$ on $\ell^{\min(p,p')}(\mathbb{Z})$. Since $M$ is bounded on both 
$\ell^p(\mathbb{Z})$ and $\ell^{p'}(\mathbb{Z})$, the Riesz-Thorin interpolation theorem [151, 248] implies that 
$M$ is bounded on $\ell^2(\mathbb{Z})$. The factorization $M C_h = C_h M = \mathrm{Id}_{\ell^2}$ holds on the dense 
subspace $\ell^{\min(p,p')}(\mathbb{Z})$, and thus $C_h$ is invertible on $\ell^2(\mathbb{Z})$ with inverse $M$, as was to 
be shown. □ 

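Corollary 5.20. If $h \in \ell^1(\mathbb{Z})$, then 

$$\sigma_{\mathcal{B}(\ell^p)}(C_h) = \sigma_{\mathcal{B}(\ell^2)}(C_h) = \sigma_{\ell^1(\mathbb{Z})}(h) = \hat h(\mathbb{T}), \qquad \forall p \in [1, \infty].$$

Proof. Theorem 5.19 says that the algebra $\{C_h : h \in \ell^1(\mathbb{Z})\}$ is inverse-closed in 
the Banach algebra $\mathcal{B}(\ell^p(\mathbb{Z}))$. The spectral identity $\sigma_{\mathcal{B}(\ell^p)}(C_h) = \sigma_{\ell^1(\mathbb{Z})}(h)$ now 
follows from Lemma 5.10. Finally, $\sigma_{\mathcal{B}(\ell^2)}(C_h) = \hat h(\mathbb{T})$ follows from (5.14). □ 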
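
Corollary 5.20 reduces the computation of the spectrum to plotting the curve $\hat h(\mathbb{T})$. As a rough numerical illustration (the symbol `h`, the sampling resolution, and the function names are assumptions made for this sketch), one can estimate the distance from a point $\lambda$ to the sampled curve; $\lambda$ belongs to the spectrum exactly when this distance is zero.

```python
import cmath
from typing import Dict


def fourier_series(h: Dict[int, complex], t: float) -> complex:
    """Evaluate h^(t) = sum_k h_k exp(2 pi i k t) for finitely supported h."""
    return sum(c * cmath.exp(2j * cmath.pi * k * t) for k, c in h.items())


def distance_to_range(h: Dict[int, complex], lam: complex, n: int = 4096) -> float:
    """Approximate dist(lam, h^(T)) by sampling h^ at n equispaced points."""
    return min(abs(fourier_series(h, j / n) - lam) for j in range(n))


h = {0: 1.0, 1: -0.5}  # h^(T) is the circle of radius 0.5 centered at 1

on_curve = distance_to_range(h, 1.5)  # 1.5 = h^(1/2) lies on the circle
center = distance_to_range(h, 1.0)    # the center is enclosed by the curve, not on it
origin = distance_to_range(h, 0.0)    # 0 lies outside the circle
```

Note that the center $\lambda = 1$ is not in the spectrum although the curve winds around it: the spectrum of $C_h$ is the curve itself, not the region it encloses.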

Summarizing, we may say that convolution operators obey a strong form of 
spectral invariance: Namely, the spectrum of a convolution operator is independent 
of the domain space $\ell^p(\mathbb{Z})$. For a version on noncommutative groups, see 
Section 5.3.4. 

Symbolic Calculus. Wiener’s Lemma may be seen as a primitive form of a symbol 
calculus. 

Usually, by a symbol calculus we understand a mapping from functions to 
operators. To each function $a$ is associated an operator $\mathrm{Op}(a)$. Then $a$ is called 
the symbol of the operator. The mapping $a \mapsto \mathrm{Op}(a)$ is assumed to be linear. A 
nice symbolic calculus satisfies some additional desirable properties. Whereas the 
pointwise product of functions is commutative, the composition of operators is 
noncommutative in general; thus, the mapping $a \mapsto \mathrm{Op}(a)$ usually fails to be an 
algebra homomorphism. However, for a “good” symbolic calculus or for a suitable 
class of “nice” symbols one can often show that the mapping is close to an algebra 
homomorphism by showing that $\mathrm{Op}(ab) - \mathrm{Op}(a)\,\mathrm{Op}(b)$ is small in some sense. 
In particular, if the operator $\mathrm{Op}(a)$ is invertible, then $\mathrm{Op}(a^{-1})$ is an approximate 
inverse. This idea is fundamental in the symbolic calculus for pseudodifferential 
operators. 

Wiener’s Lemma is the prototype of a symbolic calculus. In this case, we map 
a sequence $h$ to the convolution operator $C_h$. The distinguished class of symbols is 
$\ell^1(\mathbb{Z})$. This symbolic calculus is particularly simple, because it is an algebra 
homomorphism from $\ell^1(\mathbb{Z})$ (with respect to convolution) to bounded operators on $\ell^2(\mathbb{Z})$. 

The inverse of a convolution operator $C_h$ is again a convolution operator 
$C_h^{-1} = C_g$. If $h \in \ell^1$, then also $g \in \ell^1$. Thus, Wiener’s Lemma shows that the inverse 
of a convolution operator possesses the same form; i.e., it is again a convolution 
operator. If the symbol of the operator is “nice” (in $\ell^1$), then the symbol of the 
inverse is also “nice.” 

Exercises for Section 5.2 


1. Let $\mathcal{A}$ be an algebra with a unit element $e$ and $a \in \mathcal{A}$. If there exist $b, c \in \mathcal{A}$ such 
that $ac = e$ and $ba = e$, then $a$ is invertible and $b = c = a^{-1}$. 

2. Consider the algebra $C(\mathbb{T})$ of continuous functions of period 1. Show that the 
spectrum of a function $f \in C(\mathbb{T})$ coincides with its range: $\sigma_{C(\mathbb{T})}(f) = f(\mathbb{T}) = 
\{f(t) : t \in \mathbb{T}\}$. 

3. (a) Show that both $f(t) = 2 + \cos 2\pi t$ and $g(t) = 1 - |2t - 1|$ (for $t \in [0, 1]$ and 
extended with period 1) possess an absolutely convergent Fourier series. 

(b) Show that $f^\alpha$ possesses an absolutely convergent Fourier series for every 
$\alpha \in \mathbb{R}$. 

(c) Show that $g^{1/2}$ does not have an absolutely convergent Fourier series. (Hint: 
Use integration by parts to estimate the Fourier coefficients of $g^{1/2}$.) 

4. Let $0 < s < 1$. We say that a function $f$ on $\mathbb{T}$ is Hölder continuous with exponent 
$s$ if 

$$|f(x) - f(y)| \le C\, |x - y|^s \qquad \text{for all } x, y \in \mathbb{T}.$$

Let $C^s(\mathbb{T})$ be the space of all Hölder continuous functions with exponent $s$ and 
the norm 

$$\|f\|_{C^s} := \|f\|_\infty + \sup_{x, y \in \mathbb{T},\, x \neq y} \frac{|f(x) - f(y)|}{|x - y|^s}.$$

(a) Show that $C^s$ is a Banach algebra contained in $C(\mathbb{T})$. 

(b) Show that $C^s$ is inverse-closed in $C(\mathbb{T})$. 

(c) Find a bound for the norm of $1/f$ in $C^s(\mathbb{T})$ in terms of $\|f\|_{C^s}$. 

5. Let $\mathcal{A}^p(\mathbb{T})$ be the space of all absolutely convergent Fourier series with coefficients 
in $\ell^p(\mathbb{Z})$ for $0 < p < 1$; i.e., a Fourier series $f(t) = \sum_{k \in \mathbb{Z}} a_k e^{2\pi i k t}$ belongs 
to $\mathcal{A}^p(\mathbb{T})$ whenever $\sum_{k \in \mathbb{Z}} |a_k|^p < \infty$. Endow $\mathcal{A}^p(\mathbb{T})$ with the norm 

$$\|f\|_{\mathcal{A}^p} = \sum_{k \in \mathbb{Z}} |a_k|^p = \|a\|_p^p.$$

(a) Show that $\mathcal{A}^p$ satisfies all properties of a Banach algebra, except that the 
homogeneity of the norm is replaced by the property $\|cf\|_{\mathcal{A}^p} = |c|^p \|f\|_{\mathcal{A}^p}$ for 
$c \in \mathbb{C}$ and $f \in \mathcal{A}^p(\mathbb{T})$. [Such an algebra is called a p-normed algebra.] 

(b) Show Wiener’s Lemma for $\mathcal{A}^p(\mathbb{T})$ and $0 < p < 1$: If $f \in \mathcal{A}^p(\mathbb{T})$ and $f(t) \neq 0$ 
for all $t \in \mathbb{T}$, then $1/f \in \mathcal{A}^p(\mathbb{T})$. 

Hint: Verify and use the inequality $|a + b|^p \le |a|^p + |b|^p$ for $a, b \in \mathbb{C}$ and $0 < p \le 1$. 
Follow the proof of Wiener’s Lemma in Section 5.2.4. 

For the original statement and result, see [247]. 

6. Brandenburg’s trick [30]: Let $\mathcal{A} \subseteq \mathcal{B}$ be two Banach algebras with a common 
unit element. Assume for every $a \in \mathcal{A}$ there exists a sequence $c_n = c_n(a) > 0$, 
such that $\lim_{n \to \infty} c_n^{1/n} = 1$ and 

$$\|a^{2n}\|_{\mathcal{A}} \le c_n\, \|a^n\|_{\mathcal{A}}\, \|a^n\|_{\mathcal{B}}.$$

Show that $\mathcal{A}$ is inverse-closed in $\mathcal{B}$. Hint: Apply Hulanicki’s Lemma 5.11. 

7. Let $\mathcal{A}_s^\infty(\mathbb{T})$ be the space of all absolutely convergent Fourier series whose 
coefficients decay like $|a_k| \le C |k|^{-s}$ and norm 

$$\|f\|_{\mathcal{A}_s^\infty} = \sum_{k \in \mathbb{Z}} |a_k| + \sup_{k \in \mathbb{Z}} |a_k|\, |k|^s.$$

(a) Show that $\mathcal{A}_s^\infty(\mathbb{T})$ is a Banach algebra for any $s > 0$. 

(b) Show that $\mathcal{A}_s^\infty(\mathbb{T})$ is inverse-closed in $C(\mathbb{T})$. 

Hint: Use that $|k + l|^s \le C_s (|k|^s + |l|^s)$ for all $k, l \in \mathbb{Z}$ and prove that 

$$\sup_{k \in \mathbb{Z}} |(a * b)(k)|\, |k|^s \le C_s \Big( \|a\|_1 \sup_{k \in \mathbb{Z}} |b_k|\, |k|^s + \|b\|_1 \sup_{k \in \mathbb{Z}} |a_k|\, |k|^s \Big).$$

Now apply Brandenburg’s trick from Exercise 6. 

5.3 Variations 

In this chapter we discuss a number of variations of Wiener’s Lemma. We will cover 
the following variations: 

• weighted versions of Wiener’s Lemma for Fourier series; 

• noncommutative versions for matrix algebras; 

• a version of Wiener’s Lemma for twisted convolution and its relation to 
noncommutative tori; 

• the analysis of the spectrum of convolution operators in certain non-Abelian 
locally compact groups; and 

• a symbolic calculus for pseudodifferential operators and their spectrum. 

We will develop each subject in parallel to the exposition of Section 5.2. 

• First, we will define the basic concepts and unravel a relevant Banach algebra. 

• Then we will formulate a version of Wiener’s Lemma and turn it into a statement 
of inverse-closedness between two Banach algebras. Although we will not be 
able to give the complete proofs in each case, we will explain the main ideas and 
establish the connection to the classical version of Wiener’s Lemma. 

• Finally, we will make explicit the consequences for spectral invariance. 

Since each subsection draws material from a different field of mathematics, the 
exposition is not always self-contained. Our goal is to provide a synthetic and 
unifying view of related topics in apparently unrelated fields. 


5.3.1 Weighted Versions of Wiener’s Lemma 


When considering Fourier series, the $\ell^1$-condition on the coefficients guarantees 
that the series converges absolutely. In particular, the partial sums converge in the 
supremum norm. To obtain faster convergence of the partial sums, it is natural to 
impose decay conditions on the coefficients. This is done with weight functions. 

In general, a weight is simply a nonnegative function. To consider weighted 
absolutely convergent Fourier series, we use the following definition. From now on, 
we work with multivariate Fourier series $f(t) = \sum_{k \in \mathbb{Z}^d} a_k e^{2\pi i k \cdot t}$ for $t = (t_1, \dots, t_d) \in \mathbb{T}^d$ 
and replace the index set $\mathbb{Z}$ by $\mathbb{Z}^d$. 

A weight $v$ on $\mathbb{Z}^d$ is called submultiplicative if 

$$v(k + l) \le v(k)\, v(l) \qquad \text{for } k, l \in \mathbb{Z}^d. \tag{5.15}$$

For simplicity we consider only symmetric weights satisfying $v(-k) = v(k)$. 
Associated to each weight function on $\mathbb{Z}^d$ is the weighted $\ell^1$-space $\ell^1_v$ defined by 
the norm 

$$\|a\|_{\ell^1_v} = \|a v\|_1 = \sum_{k \in \mathbb{Z}^d} |a_k|\, v(k). \tag{5.16}$$

In analogy to $\mathcal{A}(\mathbb{T}^d)$, we define the weighted absolutely convergent Fourier series 
as follows: We say that $f \in \mathcal{A}_v(\mathbb{T}^d)$ if $f(t) = \sum_{k \in \mathbb{Z}^d} a_k e^{2\pi i k \cdot t}$ with $a \in \ell^1_v(\mathbb{Z}^d)$, and we use the norm 

$$\|f\|_{\mathcal{A}_v} = \|a\|_{\ell^1_v}.$$

Example 5.21. The typical submultiplicative weights on $\mathbb{Z}^d$ are of the form 

$$v(k) = e^{a |k|^b} (1 + |k|)^s, \qquad k \in \mathbb{Z}^d,$$

for $a, s \ge 0$ and $0 \le b \le 1$. 
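
Submultiplicativity (5.15) of these standard weights can be spot-checked on a finite box of indices. The sketch below is a numerical illustration, not a proof; the function names, the choice of the Euclidean norm for $|k|$, and the parameter values are assumptions made for the example. It also shows that an exponent $b > 1$ breaks the condition.

```python
import itertools
import math
from typing import Callable, Tuple

Index = Tuple[int, ...]


def standard_weight(a: float, b: float, s: float) -> Callable[[Index], float]:
    """v(k) = exp(a |k|^b) (1 + |k|)^s with the Euclidean norm |k|."""
    def v(k: Index) -> float:
        r = math.sqrt(sum(x * x for x in k))
        return math.exp(a * r ** b) * (1.0 + r) ** s
    return v


def is_submultiplicative(v: Callable[[Index], float], radius: int, d: int = 2) -> bool:
    """Test v(k + l) <= v(k) v(l) for all k, l in the box [-radius, radius]^d."""
    box = list(itertools.product(range(-radius, radius + 1), repeat=d))
    return all(
        v(tuple(x + y for x, y in zip(k, l))) <= v(k) * v(l) * (1.0 + 1e-12)
        for k in box for l in box
    )


# A weight from Example 5.21 (0 <= b <= 1) passes the test on the box ...
ok = is_submultiplicative(standard_weight(a=0.5, b=0.7, s=2.0), radius=4)
# ... whereas the Gaussian-type weight exp(|k|^2) (b = 2) fails it.
bad = is_submultiplicative(standard_weight(a=1.0, b=2.0, s=0.0), radius=4)
```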

Weights are useful for the study of convergence properties of Fourier series. 
They can be seen as a parameter for the rate of convergence of the partial sums. 
For simplicity take an absolutely convergent Fourier series of one variable $f(t) = 
\sum_{k \in \mathbb{Z}} a_k e^{2\pi i k t} \in \mathcal{A}_v(\mathbb{T})$ and let $S_N f(t) = \sum_{|k| \le N} a_k e^{2\pi i k t}$ be the $N$th partial sum of $f$. 
Then 

$$
\begin{aligned}
\|f - S_N f\|_\infty &= \Big\| \sum_{|k| > N} a_k e^{2\pi i k t} \Big\|_\infty \\
&\le \sum_{|k| > N} |a_k|\, v(k)\, v(k)^{-1} \\
&\le \Big( \sup_{|k| > N} v(k)^{-1} \Big) \sum_{|k| > N} |a_k|\, v(k) \\
&\le \Big( \sup_{|k| > N} v(k)^{-1} \Big) \|f\|_{\mathcal{A}_v}.
\end{aligned}
$$

For increasing weight $v$, such as the standard weights in Example 5.21, we have 
$\sup_{|k| > N} v(k)^{-1} \le v(N)^{-1}$, and thus the partial sums $S_N f$ converge to $f$ at the rate 
$v(N)^{-1}$. The precise connection between the approximability by trigonometric polynomials 
and the decay of the Fourier coefficients is treated in approximation theory. 
See, for instance, [66]. 
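
The rate $v(N)^{-1}$ can be observed numerically. The following sketch is a hedged illustration: the coefficients $a_k = (1 + |k|)^{-4}$, the polynomial weight $v(k) = (1 + |k|)^2$, the truncation at $|k| \le K$, and the sampling grid are all assumptions chosen for the example. It compares the sampled sup-norm error of the partial sums with the bound $v(N)^{-1} \|f\|_{\mathcal{A}_v}$.

```python
import cmath

s = 2.0
K = 200  # truncation: f is treated as a trigonometric polynomial of degree K


def v(k: int) -> float:
    """Polynomial weight v(k) = (1 + |k|)^s."""
    return (1.0 + abs(k)) ** s


coeffs = {k: (1.0 + abs(k)) ** -4.0 for k in range(-K, K + 1)}
norm_Av = sum(abs(c) * v(k) for k, c in coeffs.items())  # weighted l^1 norm


def tail(N: int, t: float) -> complex:
    """f(t) - S_N f(t) = sum over |k| > N of a_k exp(2 pi i k t)."""
    return sum(c * cmath.exp(2j * cmath.pi * k * t)
               for k, c in coeffs.items() if abs(k) > N)


def sup_error(N: int, grid: int = 64) -> float:
    """Sampled sup norm of f - S_N f on an equispaced grid."""
    return max(abs(tail(N, j / grid)) for j in range(grid))


errors = {N: sup_error(N) for N in (5, 10, 20)}
bounds = {N: norm_Av / v(N) for N in (5, 10, 20)}
```

For this example the bound is far from sharp, which is consistent with the text: the weight only records a guaranteed rate, while the coefficients here decay faster than the weight requires.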

The following simple lemma explains why we need the above conditions on the 
weight function v. 

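Lemma 5.22. 1. If $v$ is submultiplicative, then $\mathcal{A}_v(\mathbb{T}^d)$ is a Banach algebra with 
respect to pointwise multiplication. 

2. If in addition, $v$ is symmetric, then complex conjugation $f \mapsto \bar f$ is an isometry 
on $\mathcal{A}_v$. 


Proof. The proof is almost identical to the proof of Lemma 5.4. Fix $a, b \in \ell^1_v(\mathbb{Z}^d)$. 
Since $v(n) \le v(k)\, v(n - k)$ by the submultiplicativity of $v$, we have 

$$
\begin{aligned}
\|a * b\|_{\ell^1_v} &= \sum_{n \in \mathbb{Z}^d} \Big| \sum_{k \in \mathbb{Z}^d} a_k b_{n-k} \Big|\, v(n) \\
&\le \sum_{k \in \mathbb{Z}^d} \sum_{n \in \mathbb{Z}^d} |a_k|\, |b_{n-k}|\, v(k)\, v(n - k) \\
&= \|a\|_{\ell^1_v}\, \|b\|_{\ell^1_v}.
\end{aligned}
$$

Now let $f, g \in \mathcal{A}_v(\mathbb{T}^d)$ with coefficient sequences $a$ and $b \in \ell^1_v(\mathbb{Z}^d)$. Then 

$$\|fg\|_{\mathcal{A}_v} = \|a * b\|_{\ell^1_v} \le \|a\|_{\ell^1_v}\, \|b\|_{\ell^1_v} = \|f\|_{\mathcal{A}_v}\, \|g\|_{\mathcal{A}_v}$$

and $\mathcal{A}_v(\mathbb{T}^d)$ is a Banach algebra. Item (2) is obvious. □ 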

At this point arises the natural question of whether Wiener’s Lemma holds for the 
weighted absolutely convergent series $\mathcal{A}_v(\mathbb{T}^d)$. In other words, suppose we know 
that $f \in \mathcal{A}_v(\mathbb{T}^d)$ and $f(t) \neq 0$ for all $t \in \mathbb{T}^d$. What can we say about $1/f$? 

Though this seems a harmless variation that is typical for the mathematical mind, 
the answer to this question has led to a new idea in Banach algebra theory. A striking 
result of Gel’fand, Raikov, and Shilov characterizes all weights for which Wiener’s 
Lemma remains true [102]. 

Before we state their fundamental result, we need a new concept about weights. 

Definition 5.23. A submultiplicative weight $v$ is said to satisfy the GRS condition 
(Gel’fand-Raikov-Shilov condition) if 

$$\lim_{n \to \infty} v(n\mathbf{k})^{1/n} = 1, \qquad \forall\, \mathbf{k} \in \mathbb{Z}^d. \tag{5.17}$$

(When dealing with the GRS condition and only in this context, we write integer 
vectors in boldface as $\mathbf{k} \in \mathbb{Z}^d$ to distinguish them from positive integers.) 

The limit in (5.17) exists always because $v$ is submultiplicative. If $v$ is symmetric, 
then $v(\mathbf{k}) \ge 1$ for all $\mathbf{k} \in \mathbb{Z}^d$ and so always $\lim_{n \to \infty} v(n\mathbf{k})^{1/n} \ge 1$. 

Considering the standard weight functions $v(\mathbf{k}) = e^{a |\mathbf{k}|^b} (1 + |\mathbf{k}|)^s$, we see immediately 
that $v$ satisfies the GRS condition if and only if $0 \le b < 1$. On the other hand, 
if $b = 1$ and $v(\mathbf{k}) = e^{a |\mathbf{k}|}$ for $a > 0$, then obviously $v(n\mathbf{k})^{1/n} = e^{a |\mathbf{k}|} > 1$ for all $n$ and 
$\mathbf{k} \neq 0$, and thus the exponential weight violates the GRS condition. 
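
The dichotomy between subexponential and exponential weights is visible numerically. The sketch below evaluates $v(n\mathbf{k})^{1/n}$ through logarithms to avoid overflow; the function names and parameter values are illustrative assumptions, and only the norm $r = |\mathbf{k}|$ of the direction enters.

```python
import math


def log_weight(r: float, a: float, b: float, s: float) -> float:
    """log v for v = exp(a r^b) (1 + r)^s, where r = |k| >= 0."""
    return a * r ** b + s * math.log1p(r)


def grs_quotient(r: float, n: int, a: float, b: float, s: float) -> float:
    """v(n k)^(1/n) for an index k with |k| = r, computed via logarithms."""
    return math.exp(log_weight(n * r, a, b, s) / n)


ns = (10, 10**3, 10**6)
# Subexponential weight (b = 1/2): the quotients decrease toward 1.
subexp = [grs_quotient(3.0, n, a=1.0, b=0.5, s=2.0) for n in ns]
# Exponential weight (b = 1): the quotients approach exp(a |k|) = exp(3) > 1.
expo = [grs_quotient(3.0, n, a=1.0, b=1.0, s=2.0) for n in ns]
```

The polynomial factor $(1 + |\mathbf{k}|)^s$ never affects the limit, since $(1 + nr)^{s/n} \to 1$; only the exponential factor with $b = 1$ can push the limit above 1.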

In fact, exponential growth in some direction is the only reason why the GRS 
condition may fail. Assume that the weight $v$ violates the GRS condition. Then 
there exist a $\mathbf{k} \in \mathbb{Z}^d$ and $a > 0$ such that 

$$v(n\mathbf{k})^{1/n} \ge e^a > 1 \qquad \text{for } n \ge N_0.$$

Thus, $v(n\mathbf{k}) \ge e^{an}$ and the weight $v$ grows exponentially along the subgroup $\mathbf{k}\mathbb{Z}$. 

To summarize, the GRS condition is a precise technical condition that excludes 
the exponential growth of a weight. 

We can now formulate the weighted version of Wiener’s Lemma. 

Theorem 5.24. Let $v$ be a submultiplicative weight on $\mathbb{Z}^d$. If $v$ satisfies the GRS 
condition, then Wiener’s Lemma holds for $\mathcal{A}_v$: If $f \in \mathcal{A}_v(\mathbb{T}^d)$ and $f(t) \neq 0$ for all 
$t \in \mathbb{T}^d$, then $1/f \in \mathcal{A}_v(\mathbb{T}^d)$. 

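Proof. We follow the proof of Wiener’s Lemma in Section 5.2.4 and make the necessary 
modifications when the weight occurs. 

As in Steps 1 and 2, we may assume that $h \in \mathcal{A}_v(\mathbb{T}^d)$ and $0 \le h(t) \le 1 - \delta$ for 
some $\delta > 0$. We then have to show that the geometric series $\sum_{n=0}^\infty h^n$ converges in 
$\mathcal{A}_v(\mathbb{T}^d)$, and not just in $C(\mathbb{T}^d)$. 

Now choose a trigonometric polynomial $p$ (of degree $N$ in each variable) such 
that $\|h - p\|_{\mathcal{A}_v} < \varepsilon$. Set $r = h - p$, and then $\|r\|_\infty \le \|r\|_{\mathcal{A}_v} < \varepsilon$. 

The main modification is the following estimate for the comparison of various 
norms: If $q$ is a trigonometric polynomial of degree $N$ in each of the $d$ variables with coefficients $b_k$, 
then, writing $|k|_\infty = \max_{j = 1, \dots, d} |k_j|$ for the maximum norm on $\mathbb{R}^d$, we obtain 

$$
\begin{aligned}
\|q\|_{\mathcal{A}_v} &= \sum_{|k|_\infty \le N} |b_k|\, v(k) \\
&\le \|b\|_2\, (2N + 1)^{d/2} \max_{|k|_\infty \le N} v(k) \\
&= \|q\|_2\, (2N + 1)^{d/2} \max_{|k|_\infty \le N} v(k) \\
&\le \|q\|_\infty\, (2N + 1)^{d/2} \max_{|k|_\infty \le N} v(k).
\end{aligned} \tag{5.18}
$$

For further use, we set $\tilde v(n) = \max_{|k|_\infty \le n} v(k)$ and formulate the properties of $\tilde v$ as a 
sublemma [89]. 

Lemma 5.25. The weight $\tilde v$ is submultiplicative and increasing on $\mathbb{N}$. If $v$ satisfies 
the GRS condition, then $\tilde v$ also satisfies the GRS condition. 

The proof is elementary, but not instructive. For completeness we will give it at the 
end of this section. 

To estimate the $\mathcal{A}_v$-norm of $h^n$, we use the binomial theorem and obtain as in 
Step 5 of Section 5.2.4 

$$\|h^n\|_{\mathcal{A}_v} \le \sum_{l=0}^{n} \binom{n}{l} \|p^l\|_{\mathcal{A}_v}\, \|r^{n-l}\|_{\mathcal{A}_v} \le \sum_{l=0}^{n} \binom{n}{l} \|p^l\|_{\mathcal{A}_v}\, \varepsilon^{n-l}.$$

Since $p^l$ is a trigonometric polynomial of degree $Nl$, (5.18) yields 

$$\|p^l\|_{\mathcal{A}_v} \le \|p^l\|_\infty\, (2Nl + 1)^{d/2}\, \tilde v(Nl).$$

Therefore, the complete estimate for the $\mathcal{A}_v$-norm of $h^n$ becomes 

$$
\begin{aligned}
\|h^n\|_{\mathcal{A}_v} &\le (2Nn + 1)^{d/2} \sum_{l=0}^{n} \binom{n}{l} \varepsilon^{n-l}\, \|p\|_\infty^l\, \tilde v(lN) \\
&\le (2Nn + 1)^{d/2}\, \tilde v(nN)\, (\|p\|_\infty + \varepsilon)^n \\
&\le (2Nn + 1)^{d/2}\, \tilde v(nN)\, (\|h - r\|_\infty + \varepsilon)^n \\
&\le (2Nn + 1)^{d/2}\, \tilde v(nN)\, (1 - \delta + 2\varepsilon)^n.
\end{aligned} \tag{5.19}
$$

Finally, the $\mathcal{A}_v$-norm of the geometric series $\sum h^n$ is majorized by 

$$\sum_{n=0}^{\infty} \|h^n\|_{\mathcal{A}_v} \le \sum_{n=0}^{\infty} (2Nn + 1)^{d/2}\, \tilde v(Nn)\, (1 - \delta + 2\varepsilon)^n. \tag{5.20}$$

This series converges provided that 

$$1 > \limsup_{n \to \infty} \Big( (2Nn + 1)^{d/2}\, \tilde v(Nn)\, (1 - \delta + 2\varepsilon)^n \Big)^{1/n} = \limsup_{n \to \infty} \tilde v(Nn)^{1/n}\, (1 - \delta + 2\varepsilon).$$

This is the case whenever $\varepsilon$ is chosen with $2\varepsilon < \delta$, because $\tilde v$ satisfies the GRS condition by Lemma 5.25. □ 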


Remark 5.26. The expression in (5.20) offers some insight into the nature of the 
GRS condition. The radius of convergence of the power series $\sum_{n} \tilde v(nN)\, z^n$ is 
exactly $\big( \limsup_{n \to \infty} \tilde v(nN)^{1/n} \big)^{-1}$. For this series to converge for all $z$, $|z| < 1$, we 
need $\big( \limsup_{n \to \infty} \tilde v(nN)^{1/n} \big)^{-1} \ge 1$. This is exactly the GRS condition for $\tilde v$. 


Corollary 5.27. The algebra $\mathcal{A}_v(\mathbb{T}^d)$ is inverse-closed in $C(\mathbb{T}^d)$ if and only if the 
weight $v$ satisfies the GRS condition. 

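Proof. The sufficiency of the GRS condition is the content of Theorem 5.24. 

To show the necessity of the GRS condition, we assume that $v$ violates this 
condition and give a counterexample to Wiener’s Lemma. The following example 
illustrates the nature of the GRS condition and will return in several further 
variations. 

If $v$ violates the GRS condition, then there are $\mathbf{k} \in \mathbb{Z}^d$ and $a > 0$, and $n_0 \in \mathbb{N}$ such 
that 

$$v(n\mathbf{k}) \ge e^{an} \qquad \text{for } n \ge n_0.$$

Now fix $\delta \in (0, a]$ and set 

$$f(t) = 1 - e^{-\delta} e^{2\pi i \mathbf{k} \cdot t} \in \mathcal{A}_v(\mathbb{T}^d).$$

Then $|f(t)| \ge 1 - e^{-\delta} > 0$ and thus $f(t) \neq 0$ for all $t \in \mathbb{T}^d$. Furthermore, the inverse 
of $f$ is given by the trigonometric series 

$$\frac{1}{f(t)} = \big( 1 - e^{-\delta} e^{2\pi i \mathbf{k} \cdot t} \big)^{-1} = \sum_{n=0}^{\infty} e^{-\delta n} e^{2\pi i n \mathbf{k} \cdot t}. \tag{5.21}$$

Calculating the norm of $1/f$, we find that 

$$\Big\| \frac{1}{f} \Big\|_{\mathcal{A}_v} = \sum_{n=0}^{\infty} e^{-\delta n}\, v(n\mathbf{k}) \ge \sum_{n=n_0}^{\infty} e^{-\delta n} e^{an} = \infty.$$

Thus, $1/f$ does not belong to $\mathcal{A}_v$ and Wiener’s Lemma does not hold in $\mathcal{A}_v$. □ 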
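
The mechanism of the counterexample is a plain comparison of two geometric-type series. The sketch below is illustrative only: it assumes the concrete exponential weight $v(n\mathbf{k}) = e^{a n |\mathbf{k}|}$ with hypothetical parameter values, and compares the unweighted $\ell^1$-norm of the coefficients $e^{-\delta n}$ of $1/f$ with the partial sums of the weighted norm.

```python
import math

a, delta = 1.0, 0.5  # growth rate of the weight and decay parameter, delta in (0, a]
k_norm = 1.0         # |k| for the chosen direction; v(nk) = exp(a n |k|) is assumed


def unweighted_partial_sum(M: int) -> float:
    """Partial sum of the l^1 norm of the coefficients exp(-delta n) of 1/f."""
    return sum(math.exp(-delta * n) for n in range(M + 1))


def weighted_partial_sum(M: int) -> float:
    """Partial sum of sum_n exp(-delta n) v(nk) with v(nk) = exp(a n |k|)."""
    return sum(math.exp((a * k_norm - delta) * n) for n in range(M + 1))


limit_unweighted = 1.0 / (1.0 - math.exp(-delta))  # 1/f lies in A(T^d)
growth = [weighted_partial_sum(M) for M in (10, 50, 100)]  # unbounded in M
```

Every weighted term is at least 1 because $\delta \le a$, so the partial sums after $M$ terms exceed $M$, whereas the unweighted sums converge.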

The GRS condition is ubiquitous in the investigation of inverse-closed subalgebras 
with weights. This condition draws the fine line between exponential and 
subexponential growth. Exponential growth is special and usually implies the existence 
of some analytic structure. 

What happens in the case of exponential weights? If $v_a(k) = e^{a |k|}$ is an 
exponential weight with growth constant $a$, then $\mathcal{A}_{v_a}(\mathbb{T}^d)$ is still a Banach algebra 
with respect to pointwise multiplication (Lemma 5.22). One can show the following 
weak version of Wiener’s Lemma for exponential weights. If $f \in \mathcal{A}_{v_a}(\mathbb{T}^d)$ and 
$f(t) \neq 0$ for all $t \in \mathbb{T}^d$, then there exists $\delta > 0$, $\delta < a$, such that $1/f \in \mathcal{A}_{v_\delta}(\mathbb{T}^d)$. 
Thus, the Fourier coefficients of the inverse still have exponential decay, but the 
growth constant $\delta$ may be arbitrarily small and depends on $f$. This is also shown in 
our counterexample, especially (5.21). 

Proof (of Sublemma 5.25). Set $\tilde v(n) = \max_{|k|_\infty \le n} v(k)$. Then the weight $\tilde v$ is submultiplicative 
and increasing on $\mathbb{N}$. If $v$ satisfies the GRS condition, then $\tilde v$ also satisfies 
the GRS condition. 

The submultiplicativity and monotonicity are clear; we only show the GRS 
condition for $\tilde v$. Let $e_j$, $j = 1, \dots, d$, be the standard basis for $\mathbb{R}^d$ and let $v_j(l) = v(l e_j)$ 
be the restriction of $v$ to the subgroup $\{0\} \times \cdots \times \{0\} \times \mathbb{Z} \times \{0\} \times \cdots \times \{0\}$. Then 

$$v(k) = v\Big( \sum_{j=1}^{d} k_j e_j \Big) \le \prod_{j=1}^{d} v_j(k_j).$$

Now assume that the GRS condition is not satisfied for $\tilde v$; then for some $N \in \mathbb{N}$ and 
$a > 0$ we have $\tilde v(nN) \ge e^{an}$ for $n$ large enough. Consequently, there exists a sequence 
$k_n \in \mathbb{Z}^d$ such that $|k_n|_\infty \le nN$ and 

$$v(k_n) = \tilde v(nN) \ge e^{an}.$$

Since $e^{an} \le v(k_n) \le \prod_{j=1}^{d} v_j(k_{n,j})$, there exist a coordinate $j_0$ and a subsequence $n_r$ 
of $\mathbb{N}$ such that, with $\ell_r = |k_{n_r, j_0}|$, 

$$e^{a n_r / d} \le v_{j_0}(\ell_r) \qquad \text{and} \qquad |\ell_r| \le n_r N.$$

For this subsequence we obtain 

$$v_{j_0}(\ell_r) \ge e^{a n_r / d} \ge e^{a \ell_r / (dN)}$$

and 

$$\lim_{r \to \infty} v(\ell_r e_{j_0})^{1/\ell_r} \ge e^{a/(dN)} > 1$$

in contradiction to the assumed GRS condition of $v$. □ 


5.3.2 Matrix Algebras 


In Section 5.2.6 we argued that a discrete time-invariant channel is modeled by 
a convolution operator and discussed the relevance of Wiener’s Lemma for the 
analysis of the input-output relationship. In this section we study time-varying 
channels and the corresponding mathematical model, namely matrix algebras. We 
present a version of Wiener’s Lemma in this context. 

5.3.2.1 Discrete Time-Varying Channels 

Recall that a time-invariant system corresponds to a convolution operator 

$$(C_h a)(k) = \sum_{l \in \mathbb{Z}} h(k - l)\, a(l).$$

The infinite matrix $M$ corresponding to this linear operator has the entries 
$m_{kl} = h(k - l)$. The system matrix is thus constant along diagonals; such a matrix is 
usually called a Toeplitz matrix. 

When we deal with time-varying systems, the corresponding matrix will no 
longer be constant along diagonals, but for a slowly time-varying system, it will 
have small variations along the diagonals and will still be close to a convolution 
operator. 

If $M$ is a matrix over the index set $\mathbb{Z}^d$ with entries $m_{kl}$, $k, l \in \mathbb{Z}^d$, then the $l$th 
diagonal has the entries $m_{k,k-l}$. We may write the matrix-vector multiplication in a 
way that resembles a convolution, namely, 

$$(Ma)(k) = \sum_{l \in \mathbb{Z}^d} m_{kl}\, a_l = \sum_{l \in \mathbb{Z}^d} m_{k,k-l}\, a_{k-l}.$$

If $M$ is a Toeplitz matrix, then this is a convolution. If $M$ is “almost constant” along 
diagonals, the action of $M$ resembles a convolution. 
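
For a concrete picture, one can form a finite section of the Toeplitz matrix of a symbol and compare the matrix-vector product with the convolution. The sketch below works in dimension $d = 1$ on the index range $0, \dots, n-1$; the symbol `h`, the vector `a`, and the function names are assumptions made for the illustration.

```python
from typing import Dict, List


def toeplitz_section(h: Dict[int, float], n: int) -> List[List[float]]:
    """The n x n section (m_kl) with m_kl = h(k - l) for 0 <= k, l < n."""
    return [[h.get(k - l, 0.0) for l in range(n)] for k in range(n)]


def matvec(M: List[List[float]], a: List[float]) -> List[float]:
    """Matrix-vector product (Ma)(k) = sum_l m_kl a_l."""
    return [sum(m * x for m, x in zip(row, a)) for row in M]


def convolve_at(h: Dict[int, float], a: List[float], k: int) -> float:
    """(h * a)(k) = sum_l h(k - l) a(l) for a supported on 0, ..., len(a) - 1."""
    return sum(h.get(k - l, 0.0) * x for l, x in enumerate(a))


h = {-1: 0.5, 0: 2.0, 2: -1.0}  # illustrative finite impulse response
a = [1.0, -2.0, 0.0, 3.0, 4.0]  # input supported on 0, ..., 4
M = toeplitz_section(h, len(a))

Ma = matvec(M, a)
conv = [convolve_at(h, a, k) for k in range(len(a))]
constant_diagonals = all(
    M[k][l] == M[k + 1][l + 1] for k in range(len(a) - 1) for l in range(len(a) - 1)
)
```

The finite section reproduces the convolution only on the rows $0, \dots, n-1$; the values of $h * a$ outside this range are simply not represented.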

This observation motivated Gohberg, Kaashoek, and Woerdeman [104] to introduce 
a nonstationary Wiener algebra. It is defined as the class $\mathcal{C}$ of all matrices for 
which the norm 

$$\|M\|_{\mathcal{C}} = \sum_{l \in \mathbb{Z}^d} \sup_{k \in \mathbb{Z}^d} |m_{k,k-l}|$$

is finite. This class of matrices was studied simultaneously and independently by 
Baskakov [14] and Kurbatov [155], and was later rediscovered by Sjöstrand [214]. 
It is often named after the inventors as the Baskakov-Gohberg-Sjöstrand matrix 
algebra. This class of matrices has recently appeared in several applications in frame 
theory [10, 94, 109, 119] and in the analysis of the finite section method in numerical 
analysis [121, 198]. 

Let us look in more detail at the class $\mathcal{C}$. The number 

$$d(l) = \sup_{k \in \mathbb{Z}^d} |m_{k,k-l}|$$

is the supremum of the $l$th diagonal of $M$ and so 

$$|m_{kl}| \le d(k - l), \qquad k, l \in \mathbb{Z}^d. \tag{5.22}$$

Hence, the action of $M$ on a vector $c$ can be estimated as follows: 

$$|(Mc)(k)| = \Big| \sum_{l \in \mathbb{Z}^d} m_{kl}\, c_l \Big| \le \sum_{l \in \mathbb{Z}^d} d(k - l)\, |c_l| = (d * |c|)(k). \tag{5.23}$$

This inequality says that the action of $M$ is dominated (pointwise) by the convolution 
with the sequence $d \in \ell^1(\mathbb{Z}^d)$. This explains our notation; $\mathcal{C}$ is the class of 
convolution-dominated matrices. Young’s inequality (Lemma 5.17) implies that a 
matrix $M \in \mathcal{C}$ is bounded on every $\ell^p(\mathbb{Z}^d)$, $1 \le p \le \infty$. 
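
The quantities $d(l)$ and the pointwise domination (5.23) can be computed directly for a finite matrix, viewed as an infinite matrix padded with zeros. The following sketch is an illustration in dimension $d = 1$; the sample matrix, the vector, and the function names are assumptions.

```python
from typing import Dict, List

Matrix = List[List[float]]


def diagonal_sups(M: Matrix) -> Dict[int, float]:
    """d(l) = max_k |m_{k,k-l}| over the entries present in the n x n matrix."""
    n = len(M)
    return {
        l: max(abs(M[k][k - l]) for k in range(n) if 0 <= k - l < n)
        for l in range(-(n - 1), n)
    }


def c_norm(M: Matrix) -> float:
    """Norm of the convolution-dominated class: sum over l of d(l)."""
    return sum(diagonal_sups(M).values())


def dominated(M: Matrix, c: List[float]) -> bool:
    """Check |(Mc)(k)| <= (d * |c|)(k) for every row k, as in (5.23)."""
    n, d = len(M), diagonal_sups(M)
    for k in range(n):
        lhs = abs(sum(M[k][l] * c[l] for l in range(n)))
        rhs = sum(d[k - l] * abs(c[l]) for l in range(n))
        if lhs > rhs + 1e-12:
            return False
    return True


# A slowly varying perturbation of a Toeplitz matrix (illustrative data).
M = [[2.0, -0.5, 0.0, 0.1],
     [0.4, 1.8, -0.6, 0.0],
     [0.0, 0.5, 2.2, -0.4],
     [0.05, 0.0, 0.3, 2.1]]
c = [1.0, -3.0, 2.0, 0.5]

norm_M = c_norm(M)      # 2.2 + 0.5 + 0.6 + 0.05 + 0.1
ok = dominated(M, c)
sup_Mc = max(abs(sum(M[k][l] * c[l] for l in range(4))) for k in range(4))
```

By Young’s inequality with $p = \infty$, `sup_Mc` cannot exceed `norm_M` times the sup norm of `c`.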

Next let us consider a weighted version of convolution-dominated matrices. Let 
$v$ be a submultiplicative weight on $\mathbb{Z}^d$. Then $\mathcal{C}_v$ contains all matrices for which the 
norm 

$$\|M\|_{\mathcal{C}_v} = \sum_{l \in \mathbb{Z}^d} \sup_{k \in \mathbb{Z}^d} |m_{k,k-l}|\, v(l)$$

is finite. As above, this means that $M$ is dominated by a convolution with an 
$\ell^1_v$-sequence. In particular, the matrix of a convolution operator $C_h$ with symbol 
$h \in \ell^1_v(\mathbb{Z}^d)$ belongs to $\mathcal{C}_v$. Moreover, identifying the operator with its matrix, we 
have 

$$\|C_h\|_{\mathcal{C}_v} = \|h\|_{\ell^1_v}.$$
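
A small experiment makes the weighted norm concrete. The sketch below (dimension $d = 1$, finite sections padded with zeros, a hypothetical polynomial weight, and illustrative symbols `h` and `g`) checks that the weighted norm of a Toeplitz section equals the weighted $\ell^1$-norm of its symbol when the section is larger than the support of the symbol, and that the norm of a product is at most the product of the norms, as Lemma 5.28 asserts for submultiplicative weights.

```python
from typing import Callable, Dict, List

Matrix = List[List[float]]


def weighted_c_norm(M: Matrix, v: Callable[[int], float]) -> float:
    """Sum over l of max_k |m_{k,k-l}| v(l) for an n x n matrix."""
    n = len(M)
    return sum(
        max(abs(M[k][k - l]) for k in range(n) if 0 <= k - l < n) * v(l)
        for l in range(-(n - 1), n)
    )


def toeplitz_section(h: Dict[int, float], n: int) -> Matrix:
    """The n x n section (m_kl) with m_kl = h(k - l)."""
    return [[h.get(k - l, 0.0) for l in range(n)] for k in range(n)]


def matmul(A: Matrix, B: Matrix) -> Matrix:
    """Product of two square matrices of equal size."""
    n = len(A)
    return [[sum(A[k][r] * B[r][j] for r in range(n)) for j in range(n)]
            for k in range(n)]


def v(l: int) -> float:
    """Submultiplicative, symmetric polynomial weight (1 + |l|)^1.5."""
    return (1.0 + abs(l)) ** 1.5


h = {-1: 0.5, 0: 2.0, 2: -1.0}
g = {0: 1.0, 1: 0.25}
Mh, Mg = toeplitz_section(h, 6), toeplitz_section(g, 6)

norm_h = sum(abs(c) * v(k) for k, c in h.items())  # weighted l^1 norm of h
norm_Mh = weighted_c_norm(Mh, v)
norm_Mg = weighted_c_norm(Mg, v)
norm_product = weighted_c_norm(matmul(Mh, Mg), v)
```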

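Lemma 5.28. If $v$ is submultiplicative and symmetric, then $\mathcal{C}_v$ is a Banach *-algebra 
with respect to matrix multiplication and taking the adjoint matrix as the involution. 

Proof. Again, the proof is similar to the one of Lemmas 5.4 and 5.22 and in fact 
makes direct use of Lemma 5.22. Let $M, N \in \mathcal{C}_v$ and set $d(l) = \sup_{k \in \mathbb{Z}^d} |m_{k,k-l}|$ and 
$e(l) = \sup_{k \in \mathbb{Z}^d} |n_{k,k-l}|$. Then by (5.22) 

$$
|(MN)_{k,k-l}| = \Big| \sum_{r \in \mathbb{Z}^d} m_{kr}\, n_{r,k-l} \Big|
\le \sum_{r \in \mathbb{Z}^d} d(k - r)\, e(r - k + l) = (d * e)(l)
$$

and 

$$
\begin{aligned}
\|MN\|_{\mathcal{C}_v} &= \sum_{l \in \mathbb{Z}^d} \sup_{k \in \mathbb{Z}^d} |(MN)_{k,k-l}|\, v(l) \\
&\le \sum_{l \in \mathbb{Z}^d} (d * e)(l)\, v(l) = \|d * e\|_{\ell^1_v} \\
&\le \|d\|_{\ell^1_v}\, \|e\|_{\ell^1_v} = \|M\|_{\mathcal{C}_v}\, \|N\|_{\mathcal{C}_v}.
\end{aligned}
$$

For the involution $M \mapsto M^*$ we find 

$$
\begin{aligned}
\|M^*\|_{\mathcal{C}_v} &= \sum_{l \in \mathbb{Z}^d} \sup_{k \in \mathbb{Z}^d} |(M^*)_{k,k-l}|\, v(l) = \sum_{l \in \mathbb{Z}^d} \sup_{k \in \mathbb{Z}^d} |m_{k-l,k}|\, v(l) \\
&= \sum_{l \in \mathbb{Z}^d} \sup_{k \in \mathbb{Z}^d} |m_{k,k+l}|\, v(-l) = \|M\|_{\mathcal{C}_v}.
\end{aligned}
$$

So the involution is an isometry. □ 

5.3.2.2 A Nonstationary Version of Wiener’s Lemma 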

For convolution-dominated matrices we may now pose the same questions as for 
convolution operators in Section 5.2.6. The question is now whether the inverse of 
a convolution-dominated matrix is again convolution-dominated. In the context of 
time-varying systems we are interested in a meaningful input-output relationship. 
So if $M$ is convolution-dominated and the output $y = Mc$ is in $\ell^p$, is it true that the 
input $c$ was also in $\ell^p$? As for convolution operators in Section 5.2.6, a satisfactory 
answer requires that the spectrum of $M$ be independent of the domain space $\ell^p(\mathbb{Z}^d)$. 

After the discussion of several versions of Wiener’s Lemma, it is perhaps no 
longer surprising that the answers to these questions will be the same as for 
time-invariant systems (convolution operators). The techniques and proofs, however, 
are significantly more involved, because the matrix algebra $\mathcal{C}$ is highly 
noncommutative. 

Theorem 5.29. If $M \in \mathcal{C}$ and $M$ is invertible on $\ell^2(\mathbb{Z}^d)$, then $M^{-1} \in \mathcal{C}$. 

This result was obtained independently and almost simultaneously by Gohberg, 
Kaashoek, and Woerdeman [104], by Baskakov [14] and Kurbatov [155], and a 
little later by Sjöstrand [214] with a completely different proof. 

As with the classical Wiener’s Lemma, we next consider a variation of 
Theorem 5.29 with weights. The weighted version is due to Baskakov, who studied 
convolution-dominated operators on Banach spaces with unconditional (block) 
bases. The following theorem is often referred to as Baskakov’s theorem [14]. 

Theorem 5.30 (Baskakov [14]). Assume that $v$ satisfies the GRS condition and $1 \le 
p \le \infty$. If $M \in \mathcal{C}_v$ and $M$ is invertible on $\ell^p(\mathbb{Z}^d)$, then $M^{-1} \in \mathcal{C}_v$. In other words, 
$\mathcal{C}_v$ is inverse-closed in $\mathcal{B}(\ell^p)$ for all $p \in [1, \infty]$. 

We cannot give the complete proof of this theorem, but will sketch the proof 
idea at the end of this section. In particular, we will elaborate the relationship of 
Theorems 5.29 and 5.30 with the classical Wiener’s Lemma. 

As with convolution operators, the GRS condition characterizes those weights 
for which Wiener’s Lemma holds. 

Corollary 5.31. The algebra $\mathcal{C}_v$ is inverse-closed in $\mathcal{B}(\ell^2)$ if and only if $v$ satisfies 
the GRS condition. 

This is just a reformulation of Theorem 5.30. The hard part is to show that the 
GRS condition is sufficient for the inverse-closedness of $\mathcal{C}_v$. 

To verify the necessity of the GRS condition, we return to the example in the 
proof of Corollary 5.27. If $v$ violates the GRS condition, then there are $\mathbf{k} \in \mathbb{Z}^d$ and 
$a > 0$ such that $v(n\mathbf{k}) \ge e^{an}$ for $n \ge n_0$. 

We construct an invertible convolution operator $C_h$ with $h \in \ell^1_v$ and $C_h^{-1} \notin \mathcal{C}_v$. 
Fix $\delta \in (0, a]$, set $h(0) = 1$, $h(\mathbf{k}) = -e^{-\delta}$, and $h(l) = 0$ for $l \neq 0$ and $l \neq \mathbf{k}$, and consider the 
convolution operator $C_h$. Since $\hat h(t) = 1 - e^{-\delta} e^{2\pi i \mathbf{k} \cdot t} \neq 0$ for all $t \in \mathbb{T}^d$, the operator 
$C_h$ is invertible on $\ell^2(\mathbb{Z}^d)$ by Corollary 5.20. Its inverse is the operator $C_g = C_h^{-1}$ 
with $\hat g(t) = \hat h(t)^{-1} = \sum_{n=0}^{\infty} e^{-\delta n} e^{2\pi i n \mathbf{k} \cdot t}$. Since $g \notin \ell^1_v(\mathbb{Z}^d)$, the matrix of $C_g$ is not in 
$\mathcal{C}_v$. So $\mathcal{C}_v$ is not inverse-closed in $\mathcal{B}(\ell^2)$. 

Further variations on convolution-dominated matrices, including a complete 
characterization of inverse-closed Banach algebras of convolution-dominated 
matrices, can be found in [120]. For a recent extension to noncommutative groups 
as index sets, see [90]. 


5.3.2.3 Spectral Invariance 

We next formulate Baskakov’s theorem as a statement about the spectral invariance 
of matrices. 

We first note that, like convolution operators, a matrix in $\mathcal{C}_v$ is not only bounded 
on $\ell^2(\mathbb{Z}^d)$, but acts on a whole class of weighted $\ell^p$-spaces (and other spaces 
as well). For this we introduce another class of weight functions. We say that a 
nonnegative function $m$ on $\mathbb{Z}^d$ is $v$-moderate if 

$$m(k + l) \le C\, v(k)\, m(l), \qquad \text{for all } k, l \in \mathbb{Z}^d. \tag{5.24}$$

A weight is called moderate if it is $v$-moderate with respect to some submultiplicative 
weight $v$. 
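
Moderate weights may grow or decay, as long as they do so no faster than $v$. The sketch below is a numerical illustration in dimension $d = 1$; the weights, the box size, and the function names are assumptions. It estimates the smallest constant $C$ in (5.24) on a finite box for a decaying polynomial weight, a growing polynomial weight, and an exponential weight, all measured against the polynomial weight $v(k) = (1 + |k|)^2$.

```python
import math
from typing import Callable


def poly(t: float) -> Callable[[int], float]:
    """Polynomial weight k -> (1 + |k|)^t; the exponent t may be negative."""
    return lambda k: (1.0 + abs(k)) ** t


def expo(k: int) -> float:
    """Exponential weight exp(|k|), which is not moderate for polynomial v."""
    return math.exp(abs(k))


def moderateness_constant(m: Callable[[int], float],
                          v: Callable[[int], float], radius: int) -> float:
    """Smallest C with m(k + l) <= C v(k) m(l) for |k|, |l| <= radius."""
    idx = range(-radius, radius + 1)
    return max(m(k + l) / (v(k) * m(l)) for k in idx for l in idx)


v = poly(2.0)
C_decaying = moderateness_constant(poly(-2.0), v, 30)  # stays at 1
C_growing = moderateness_constant(poly(2.0), v, 30)    # stays at 1
C_exp_10 = moderateness_constant(expo, v, 10)          # grows with the box
C_exp_20 = moderateness_constant(expo, v, 20)
```

For the two polynomial weights the constant does not depend on the box, in agreement with the elementary inequality $1 + |l| \le (1 + |k|)(1 + |k + l|)$; for the exponential weight no finite constant works on all of $\mathbb{Z}$.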

Let the weighted $\ell^p$-space $\ell^p_m(\mathbb{Z}^d)$ be defined by the norm $\|c\|_{\ell^p_m} = \|c\, m\|_p$. 

The relevance of moderate weight functions is explained by the next lemma. 

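Lemma 5.32. Let $v$ be a submultiplicative weight on $\mathbb{Z}^d$. 

1. Then $\ell^p_m$ is invariant under all translations $T_k$, $k \in \mathbb{Z}^d$, if and only if $m$ is moderate. 

2. If $m$ is $v$-moderate, then $\ell^1_v * \ell^p_m \subseteq \ell^p_m$; i.e., if $a \in \ell^1_v$ and $b \in \ell^p_m$, then $a * b \in \ell^p_m$ 
with the convolution estimate 

$$\|a * b\|_{\ell^p_m} \le C\, \|a\|_{\ell^1_v}\, \|b\|_{\ell^p_m}. \tag{5.25}$$

3. If $M \in \mathcal{C}_v$ and $m$ is $v$-moderate, then $M$ is bounded on every $\ell^p_m$ for $1 \le p \le \infty$, 
and 

$$\|Mc\|_{\ell^p_m} \le C\, \|M\|_{\mathcal{C}_v}\, \|c\|_{\ell^p_m},$$

where $C$ is the constant in (5.24). 

Proof. Items (1) and (2) are elementary and left to the reader (see Exercises). In 
fact, (2) is just a modified version of Young’s inequality (Lemma 5.17). 

Since $|(Mc)(k)| \le (d * |c|)(k)$ by (5.23), the weighted Young inequality (5.25) 
yields 

$$\|Mc\|_{\ell^p_m} \le C\, \|d\|_{\ell^1_v}\, \|c\|_{\ell^p_m} = C\, \|M\|_{\mathcal{C}_v}\, \|c\|_{\ell^p_m}. \qquad \square$$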

We may reformulate (3) by saying that the matrix algebra $\mathcal{C}_v$ is continuously
embedded in the algebra $\mathcal{B}(\ell^p_m)$ of bounded operators on $\ell^p_m$. Given a matrix $M$
that is bounded on $\ell^p_m$, we denote its spectrum as an operator on $\ell^p_m$ by $\sigma_{\mathcal{B}(\ell^p_m)}(M)$.
We can now formulate the spectral invariance for convolution-dominated
matrices.

Corollary 5.33. Assume that $v$ satisfies the GRS condition and $M \in \mathcal{C}_v$. Then the
spectral invariance

$$\sigma_{\mathcal{B}(\ell^p_m)}(M) = \sigma_{\mathcal{B}(\ell^2)}(M)$$

holds for every $p \in [1,\infty]$ and every $v$-moderate weight $m$.

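Proof. The statement follows from a weighted version of Baskakov’s theorem
(Theorem 5.30). In fact, Baskakov’s general result implies that $\mathcal{C}_v$ is inverse-closed
in $\mathcal{B}(\ell^p_m)$ for $1 \le p < \infty$ and every $v$-moderate weight $m$. Now Lemma 5.10 implies
that $\sigma_{\mathcal{B}(\ell^p_m)}(M) = \sigma_{\mathcal{C}_v}(M)$ for all $M \in \mathcal{C}_v$. For $p = \infty$ we use duality and find that
$\sigma_{\mathcal{B}(\ell^\infty_m)}(M) = \overline{\sigma_{\mathcal{B}(\ell^1_{1/m})}(M^*)} = \overline{\sigma_{\mathcal{C}_v}(M^*)} = \sigma_{\mathcal{C}_v}(M)$. Thus, the spectrum is independent
of the domain space. □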

Corollary 5.33 says that the spectrum of a matrix $M$ is independent of the domain
space provided that the matrix has sufficient off-diagonal decay. We may also state
the corollary in the style of Theorem 5.19 as follows: A matrix $M \in \mathcal{C}_v$ is invertible
on $\ell^2(\mathbb{Z}^d)$ if and only if $M$ is invertible on $\ell^p_m(\mathbb{Z}^d)$ for some $p \in [1,\infty]$ and some
$v$-moderate weight $m$, if and only if $M$ is invertible on $\ell^p_m(\mathbb{Z}^d)$ for all $p \in [1,\infty]$
and all $v$-moderate weights $m$.

This result is analogous to Corollary 5.20 for convolution operators. However,
whereas convolution operators on $\ell^p$ form a commutative algebra of operators,
$\mathcal{C}_v$ is a highly noncommutative matrix algebra.

For some recent results in the line of Baskakov’s theorem, see [2, 211]. 


5.3.2.4 The Idea of the Proof of Baskakov’s Theorem 

Although we cannot give a complete proof of Baskakov’s theorem, we will indicate 
the main ideas. Our goal is to show how the classical version of Wiener’s Lemma 
for absolutely convergent Fourier series enters the field of matrix algebras. 

It is tempting to imitate the proof of the classical Wiener’s Lemma (Theorem 5.5)
in the noncommutative setting. Though many of the steps carry over, the proof
cannot be saved because the binomial theorem is not applicable in noncommutative
algebras.

Following deLeeuw [63], we first associate to every matrix $A$ a matrix-valued
function. Define the modulation $M_t$, $t \in \mathbb{R}^d$, acting on a sequence $c$ by

$$(M_t c)(k) = e^{2\pi i k\cdot t}\, c(k) \quad \text{for } k \in \mathbb{Z}^d.$$

Given a matrix $A$, we next consider the matrix-valued function

$$f(t) = M_t A M_{-t}.$$

Clearly, $M_t A M_{-t}$ is periodic with period 1 in each coordinate of $t = (t_1, \dots, t_d) \in \mathbb{R}^d$
and is a continuous matrix-valued function on $\mathbb{T}^d$. Instinctively, the first question to
ask is, what are the Fourier coefficients of $f$?

Lemma 5.34 (Fourier coefficients of $f$). The $n$th Fourier coefficient is the $n$th side
diagonal of $A$. Precisely, let $D_n$ be the matrix with entries $(D_n)_{k,k-n} = a_{k,k-n}$ and
$(D_n)_{kl} = 0$ for $l \ne k - n$, where $k, l, n \in \mathbb{Z}^d$. Then

$$\int_{[0,1]^d} f(t)\, e^{-2\pi i n\cdot t}\, dt = D_n. \tag{5.26}$$

Proof. The integral in (5.26) is a matrix-valued integral; we interpret it entrywise.
First note that

$$(M_t A M_{-t} c)(k) = e^{2\pi i k\cdot t} \sum_{l\in\mathbb{Z}^d} a_{kl}\, e^{-2\pi i l\cdot t}\, c_l = \sum_{l\in\mathbb{Z}^d} a_{kl}\, e^{2\pi i (k-l)\cdot t}\, c_l.$$

The matrix $f(t) = M_t A M_{-t}$ has the entries $a_{kl}\, e^{2\pi i (k-l)\cdot t}$. Therefore, the $n$th Fourier
coefficient of the $(k,l)$th entry is

$$\hat f(n)_{kl} = \int_{[0,1]^d} f(t)_{kl}\, e^{-2\pi i n\cdot t}\, dt = \int_{[0,1]^d} a_{kl}\, e^{2\pi i (k-l)\cdot t}\, e^{-2\pi i n\cdot t}\, dt = a_{kl}\, \delta_{k-l-n} = a_{k,k-n}\, \delta_{k-l-n},$$

and so $\hat f(n) = D_n$. □
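Lemma 5.34 can be illustrated on a finite section of a matrix. The sketch below works in dimension $d = 1$ with a small matrix indexed by $0, \dots, N-1$ (an assumption made for illustration; the lemma itself concerns infinite matrices) and replaces the integral in (5.26) by an equispaced Riemann sum, which is exact for the trigonometric polynomials that occur here as long as the number of nodes `Q` exceeds every frequency $|k - l - n|$ involved. The helper names are hypothetical.

```python
import cmath


def conjugate_by_modulation(A, t):
    """Entries of f(t) = M_t A M_{-t}: exp(2 pi i k t) * a_kl * exp(-2 pi i l t)."""
    N = len(A)
    return [[cmath.exp(2j * cmath.pi * k * t) * A[k][l] * cmath.exp(-2j * cmath.pi * l * t)
             for l in range(N)] for k in range(N)]


def fourier_coefficient(A, n, Q=16):
    """Approximate the n-th Fourier coefficient of f entrywise with Q equispaced nodes."""
    N = len(A)
    coeff = [[0j] * N for _ in range(N)]
    for q in range(Q):
        t = q / Q
        F = conjugate_by_modulation(A, t)
        weight = cmath.exp(-2j * cmath.pi * n * t) / Q
        for k in range(N):
            for l in range(N):
                coeff[k][l] += F[k][l] * weight
    return coeff


def side_diagonal(A, n):
    """The matrix D_n: keeps the entries a_{k,k-n} and sets all others to zero."""
    N = len(A)
    return [[A[k][l] if l == k - n else 0 for l in range(N)] for k in range(N)]


def max_entry_gap(X, Y):
    """Largest entrywise distance between two square matrices of equal size."""
    return max(abs(X[k][l] - Y[k][l]) for k in range(len(X)) for l in range(len(X)))


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
gaps = {n: max_entry_gap(fourier_coefficient(A, n), side_diagonal(A, n)) for n in range(-2, 3)}
```

Each value in `gaps` should be at rounding-error level, reflecting $\hat f(n) = D_n$; summing `side_diagonal(A, n)` over $n$ recovers `A`, as in the case $t = 0$ of the Fourier series.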

Therefore, the matrix-valued function $f(t)$ possesses the formal Fourier series
$f(t) = \sum_{n\in\mathbb{Z}^d} D_n\, e^{2\pi i n\cdot t}$. In particular, for $t = 0$ we recover $A = \sum_{n\in\mathbb{Z}^d} D_n$ as a sum of
its diagonals. As always with Fourier series, we must be cautious in which sense the
Fourier series represents the given function.

This question is easy to answer for convolution-dominated matrices.
Recall that

$$\|M\|_{\mathcal{C}} = \sum_{n\in\mathbb{Z}^d} \sup_{k\in\mathbb{Z}^d} |m_{k,k-n}| = \sum_{n\in\mathbb{Z}^d} \|D_n\|_{\ell^p \to \ell^p}.$$

Thus, the Fourier series of $f$ converges in the operator norm, and $f$ possesses an
absolutely convergent Fourier series. The difference is that the coefficients of $f$ are
infinite matrices (or operators).

Let us go a step further and introduce the space of absolutely convergent
matrix-valued Fourier series $\mathcal{A}_v(\mathbb{T}^d, \mathcal{B}(\ell^p))$. A matrix-valued function $f$ belongs to
$\mathcal{A}_v(\mathbb{T}^d, \mathcal{B}(\ell^p))$ if it has a Fourier series $f(t) = \sum_{k\in\mathbb{Z}^d} A_k\, e^{2\pi i k\cdot t}$ with coefficients
$A_k \in \mathcal{B}(\ell^p)$ satisfying

$$\|f\|_{\mathcal{A}_v(\mathbb{T}^d, \mathcal{B}(\ell^p))} = \sum_{k\in\mathbb{Z}^d} \|A_k\|_{\ell^p\to\ell^p}\, v(k) < \infty.$$

Using this notation, we quickly obtain the following. 

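Lemma 5.35. 1. A matrix $A$ belongs to $\mathcal{C}_v$ if and only if the matrix-valued function
$f$ belongs to $\mathcal{A}_v(\mathbb{T}^d, \mathcal{B}(\ell^p))$. In this case,

$$\|A\|_{\mathcal{C}_v} = \|f\|_{\mathcal{A}_v(\mathbb{T}^d, \mathcal{B}(\ell^p))}.$$

2. Assume that Wiener’s Lemma holds for $\mathcal{A}_v(\mathbb{T}^d, \mathcal{B}(\ell^p))$. [This means that if
$f \in \mathcal{A}_v(\mathbb{T}^d, \mathcal{B}(\ell^p))$ and $f(t)$ is invertible on $\ell^p(\mathbb{Z}^d)$ for all $t \in \mathbb{T}^d$, then $f^{-1}$ is
in $\mathcal{A}_v(\mathbb{T}^d, \mathcal{B}(\ell^p))$.] Then $\mathcal{C}_v$ is inverse-closed in $\mathcal{B}(\ell^p)$.

Proof. (1) follows directly from the definitions.

(2): If $A \in \mathcal{C}_v \subseteq \mathcal{B}(\ell^p)$ is invertible, then the operator-valued function associated
to its inverse is just $M_t A^{-1} M_{-t} = f(t)^{-1}$, and $f(t)$ is invertible on $\ell^p(\mathbb{Z}^d)$ for
all $t \in \mathbb{T}^d$. Wiener’s Lemma for $\mathcal{A}_v(\mathbb{T}^d, \mathcal{B}(\ell^p))$ implies that $f^{-1} \in \mathcal{A}_v(\mathbb{T}^d, \mathcal{B}(\ell^p))$.
Consequently, by (1), $A^{-1} \in \mathcal{C}_v$. □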

Lemma 5.35 establishes a surprisingly direct connection between the topic
of matrix algebras and the classical formulation of Wiener’s Lemma. To understand
the inverse of a convolution-dominated matrix $A \in \mathcal{C}_v$, we need a version
of Wiener’s Lemma for operator-valued Fourier series. Such a generalization of
Wiener’s Lemma exists indeed and was obtained by Bochner and Phillips [24]
already in 1946. Though this generalization is “only” from scalar-valued functions
to matrix-valued functions, it is highly nontrivial, because the algebra of
operator-valued absolutely convergent Fourier series is noncommutative. Therefore,
neither Gel’fand theory nor the elegant arguments of Newman and Hulanicki used in
Section 5.2.4 can be applied. The proof of the operator-valued version of Wiener’s
Lemma requires several results from noncommutative Banach algebras and their
representations. For the details we refer to the original sources [14, 24] or the
appendix of [120].

Our main point was to reveal the connection of the nonstationary version of 
Wiener’s Lemma to its classical version. 


5.3.2.5 Off-Diagonal Decay of Matrices

So far we have discussed convolution-dominated matrices in analogy to absolutely
convergent Fourier series. In applications, many other conditions are used to measure
off-diagonal decay. We mention just a few of them.

(a) Strict off-diagonal decay is measured by the norm

$$\sup_{k,l\in\mathbb{Z}^d} |m_{kl}|\, v(k-l).$$

The weight function $v$ measures the rate of off-diagonal decay. The typical weights
are the polynomial weights $v(k) = (1 + |k|)^s$ or a combination of polynomial and
subexponential weights $v(k) = (1 + |k|)^s e^{a|k|^b}$, $s > d$, $a > 0$, and $0 < b < 1$.

(b) Schur-type conditions: In imitation of Schur’s test, which involves the column
sums and row sums of a matrix, we may also study the norm

$$\|M\|_{\mathcal{A}_v^1} = \max\Big\{ \sup_{k\in\mathbb{Z}^d} \sum_{l\in\mathbb{Z}^d} |m_{kl}|\, v(k-l),\ \sup_{l\in\mathbb{Z}^d} \sum_{k\in\mathbb{Z}^d} |m_{kl}|\, v(k-l) \Big\}. \tag{5.27}$$

If $v$ is submultiplicative on $\mathbb{Z}^d$, then $\mathcal{A}_v^1$, the space of matrices with finite norm (5.27),
is a Banach algebra with respect to matrix multiplication (see Exercises).

We note that the matrix algebras defined by off-diagonal conditions obey the 
following inclusion relations: 


(5.28) 

As in the case of convolution-dominated matrices, the off-diagonal decay is 
preserved under suitable conditions on the weight involved. We quote a typical 
theorem from [119] and refer also to the work of Baskakov [14-16] and Sun [221]. 

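Theorem 5.36. 1. Assume that $v^{-1} \in \ell^1(\mathbb{Z}^d)$, $v^{-1} * v^{-1} \le C v^{-1}$, and $v$ satisfies the
GRS condition. Then the algebra of matrices with finite norm (a) is inverse-closed in $\mathcal{B}(\ell^2)$.

2. Assume that $v$ is submultiplicative, $v(k) \ge (1 + |k|)^{\delta}$ for some $\delta > 0$, and $v$
satisfies the GRS condition. Then the algebra of matrices with finite Schur-type norm (5.27) is inverse-closed in $\mathcal{B}(\ell^2)$.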

Again, the GRS condition is necessary and sufficient for the validity of 
Theorem 5.36. The necessity follows from the counterexample constructed after 
Corollary 5.31. 


5.3.3 Absolutely Convergent Series of Time-Frequency Shifts 

We now turn to time-frequency analysis and noncommutative geometry. 

5.3.3.1 The Basic Definitions 

Recall the definition of time-frequency shifts. Given $x, \xi \in \mathbb{R}^d$, the translation
operator or time shift $T_x$ and the modulation or frequency shift $M_\xi$ act on a function $f$
on $\mathbb{R}^d$ by

$$T_x f(t) = f(t - x) \quad \text{and} \quad M_\xi f(t) = e^{2\pi i \xi\cdot t} f(t), \qquad x, \xi, t \in \mathbb{R}^d.$$

Combining the parameters $x$ and $\xi$ into a single point $z = (x, \xi) \in \mathbb{R}^{2d}$ in the
time-frequency “plane,” their composition is the time-frequency shift

$$\pi(z) f(t) = M_\xi T_x f(t) = e^{2\pi i \xi\cdot t} f(t - x), \qquad t \in \mathbb{R}^d. \tag{5.29}$$

The time-frequency shifts $\pi(z)$, $z \in \mathbb{R}^{2d}$, are unitary operators on $L^2(\mathbb{R}^d)$ and
isometries on $L^p(\mathbb{R}^d)$ for $p \ne 2$. The translations and modulations satisfy the
commutation relations

$$T_x M_\xi = e^{-2\pi i x\cdot\xi}\, M_\xi T_x.$$

As a consequence the composition of two time-frequency shifts $\pi(z)$ and $\pi(w)$ with
$w = (w_1, w_2) \in \mathbb{R}^{2d}$ and $z = (z_1, z_2) \in \mathbb{R}^{2d}$ is

$$\pi(w)\pi(z) = (M_{w_2} T_{w_1})(M_{z_2} T_{z_1}) = e^{-2\pi i w_1\cdot z_2}\, M_{w_2+z_2} T_{w_1+z_1} = e^{-2\pi i w_1\cdot z_2}\, \pi(w + z). \tag{5.30}$$

Thus, the composition of time-frequency shifts is a multiple of a time-frequency
shift. However, since the $\pi(z)$, $z \in \mathbb{R}^{2d}$, do not commute, the mathematics of
time-frequency shifts always leads to noncommutative structures.
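The composition rule (5.30) is easy to test numerically in dimension $d = 1$. The following sketch represents functions on $\mathbb{R}$ as Python callables; the Gaussian test function, the sample points, and the particular values of `w` and `z` are assumptions chosen for illustration.

```python
import cmath
import math


def translate(x):
    """Time shift T_x: (T_x f)(t) = f(t - x)."""
    return lambda f: (lambda t: f(t - x))


def modulate(xi):
    """Frequency shift M_xi: (M_xi f)(t) = exp(2 pi i xi t) f(t)."""
    return lambda f: (lambda t: cmath.exp(2j * math.pi * xi * t) * f(t))


def tf_shift(z):
    """Time-frequency shift pi(z) = M_xi T_x for z = (x, xi), as in (5.29)."""
    x, xi = z
    return lambda f: modulate(xi)(translate(x)(f))


def gaussian(t):
    return math.exp(-math.pi * t * t)


def max_gap(F, G, points):
    """Largest pointwise distance between two functions on the sample points."""
    return max(abs(F(t) - G(t)) for t in points)


w, z = (0.3, -1.0), (0.7, 0.5)
points = [0.1 * j for j in range(-30, 31)]

# Left-hand side of (5.30): pi(w) pi(z) applied to the Gaussian.
composed = tf_shift(w)(tf_shift(z)(gaussian))
# Right-hand side of (5.30): phase factor times pi(w + z) applied to the Gaussian.
phase = cmath.exp(-2j * math.pi * w[0] * z[1])
shifted = tf_shift((w[0] + z[0], w[1] + z[1]))(gaussian)

gap_530 = max_gap(composed, lambda t: phase * shifted(t), points)
# Reversing the order changes the phase, so the two shifts do not commute.
gap_commute = max_gap(composed, tf_shift(z)(tf_shift(w)(gaussian)), points)
```

Here `gap_530` should be at rounding-error level, while `gap_commute` is of order one because $w_1 z_2 - z_1 w_2$ is not an integer for these parameters.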

Next we fix a lattice $\Lambda$ in $\mathbb{R}^{2d}$. A lattice is a discrete subgroup $\Lambda \subseteq \mathbb{R}^{2d}$ with
compact quotient $\mathbb{R}^{2d}/\Lambda$. Choosing a basis $a_j$, $j = 1, \dots, 2d$, we can write every
$\lambda \in \Lambda$ as $\lambda = \sum_{j=1}^{2d} k_j a_j$ with integer coefficients $k_j \in \mathbb{Z}$. Consequently, every lattice
can be represented as $\Lambda = A\mathbb{Z}^{2d}$ for some invertible $2d \times 2d$ matrix $A$, the columns
of which are just the basis vectors $a_j$.


5.3.3.2 The Rotation Algebra 

In time-frequency analysis and in noncommutative geometry one considers formal
sums of time-frequency shifts on a lattice $\Lambda$, i.e., operators of the form

$$A = \sum_{\lambda\in\Lambda} a_\lambda\, \pi(\lambda).$$

We should think of sums of time-frequency shifts as a noncommutative analogue
of Fourier series. The complex exponentials $e^{2\pi i k\cdot t}$ are replaced by the unitary
operators $\pi(\lambda)$. Whereas a Fourier series is a function on the torus $\mathbb{T}^d$, a sum of
time-frequency shifts on a lattice $\Lambda$ yields an operator. The structure and properties
of function spaces (and algebras) on the torus $\mathbb{T}^d$ completely describe the
(topological) properties of $\mathbb{T}^d$ (this is the content of the theorem of Gel’fand-Naimark
[27, 56, 150, 209]). By analogy the sums of time-frequency shifts are interpreted
as “functions” on some “exotic” structure. Since by (5.30) the time-frequency
shifts $\pi(\lambda)$ do not commute for a general lattice, this structure is taken to be a
noncommutative torus.

Keeping the analogy between Fourier series and time-frequency shifts in mind, 
it is now time to make some precise definitions. 

Definition 5.37. Let $\mathcal{A}_0(\Lambda)$ be the vector space of all finite linear combinations of
time-frequency shifts $\pi(\lambda)$, $\lambda \in \Lambda$. The rotation algebra or noncommutative torus $C^*(\Lambda)$
is the closure of $\mathcal{A}_0(\Lambda)$ in the operator norm on $L^2(\mathbb{R}^d)$.

By definition $C^*(\Lambda)$ is a closed subspace of $\mathcal{B}(L^2(\mathbb{R}^d))$, the algebra of all
bounded operators on $L^2(\mathbb{R}^d)$. An operator $A \in \mathcal{B}(L^2)$ belongs to $C^*(\Lambda)$ if and only
if for every $\varepsilon > 0$ there exists a finite linear combination $P = \sum_{\lambda\in\Lambda} c_\lambda \pi(\lambda) \in \mathcal{A}_0(\Lambda)$
with $\operatorname{supp} c$ finite, such that $\|A - P\|_{L^2\to L^2} < \varepsilon$.

Let us collect some basic properties of series of time-frequency shifts to ensure 
that such series are well defined. 

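Lemma 5.38. 1. If the sequence $c = (c_\lambda)_{\lambda\in\Lambda}$ grows polynomially, $|c_\lambda| = \mathcal{O}(1 + |\lambda|^N)$
for some $N \ge 0$, then the sum $\sum_{\lambda\in\Lambda} c_\lambda \pi(\lambda)$ is a well-defined continuous
operator from the Schwartz class $\mathcal{S}(\mathbb{R}^d)$ to $\mathcal{S}'(\mathbb{R}^d)$.

2. Strong linear independence: If $\sum_{\lambda\in\Lambda} c_\lambda \pi(\lambda) = 0$ for a sequence $c$ of polynomial
growth, then $c = 0$.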

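Proof. 1. We use the following property of time-frequency shifts ([108, Thm.
11.2.5] or [92]): If $f, g \in \mathcal{S}(\mathbb{R}^d)$, then the function $z \in \mathbb{R}^{2d} \to \langle \pi(z) f, g \rangle$ belongs to
$\mathcal{S}(\mathbb{R}^{2d})$ and depends continuously on $f$ and $g$. In particular, for all $M \ge 0$,

$$|\langle \pi(\lambda) f, g \rangle| = \mathcal{O}\big((1 + |\lambda|)^{-M}\big).$$

Consequently, $|\langle \sum_{\lambda\in\Lambda} c_\lambda \pi(\lambda) f, g \rangle| \le \sum_{\lambda\in\Lambda} |c_\lambda|\, |\langle \pi(\lambda) f, g \rangle|$ is well defined and
$\sum_{\lambda\in\Lambda} c_\lambda \pi(\lambda)$ makes sense as an operator from $\mathcal{S}(\mathbb{R}^d)$ to $\mathcal{S}'(\mathbb{R}^d)$.

2. The proof of the linear independence is taken from [113]. By assumption we
have, for all $g, h \in \mathcal{S}(\mathbb{R}^d)$ and $z \in \mathbb{R}^{2d}$,

$$\sum_{\lambda\in\Lambda} c_\lambda\, \langle \pi(\lambda)\pi(z) g, \pi(z) h \rangle = 0.$$

Now $\pi(z)^{-1}\pi(\lambda)\pi(z) = e^{2\pi i [z,\lambda]}\pi(\lambda)$, where $[z,\lambda] = z_1\cdot\lambda_2 - z_2\cdot\lambda_1$ is the symplectic
form on $\mathbb{R}^{2d}$. This implies that

$$\sum_{\lambda\in\Lambda} c_\lambda\, e^{2\pi i [z,\lambda]}\, \langle \pi(\lambda) g, h \rangle = 0 \tag{5.31}$$

for all $z \in \mathbb{R}^{2d}$ and all $g, h \in \mathcal{S}(\mathbb{R}^d)$.

The left-hand side of (5.31), viewed as a function of $z$, is an absolutely convergent Fourier series on $\mathbb{R}^{2d}/\Lambda$. Since it
vanishes everywhere, we must have

$$c_\lambda\, \langle \pi(\lambda) g, h \rangle = 0 \quad \text{for all } \lambda \in \Lambda,$$

from which we deduce that $c_\lambda = 0$ for all $\lambda$. □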

Lemma 5.38 guarantees that formal series of time-frequency shifts with polynomially
growing coefficients are always well defined in a distributional sense. In
particular, every $A \in C^*(\Lambda)$ possesses a unique expansion $A = \sum_{\lambda\in\Lambda} a_\lambda \pi(\lambda)$. It
can be shown that the coefficient sequence must be in $\ell^2(\Lambda)$. However, not every
$c \in \ell^2(\Lambda)$ defines an operator in $\mathcal{B}(L^2(\mathbb{R}^d)) \supseteq C^*(\Lambda)$. See [147] for details.

To pursue the analogy between Fourier series and series of time-frequency
shifts further, we next study the composition of sums of time-frequency shifts.
Let $A = \sum_{\lambda\in\Lambda} a_\lambda \pi(\lambda) \in \mathcal{A}_0(\Lambda)$ and $B = \sum_{\mu\in\Lambda} b_\mu \pi(\mu) \in \mathcal{A}_0(\Lambda)$ be two finite sums
of time-frequency shifts. We now mimic the calculation in Lemma 5.4 and see what
we get. Using (5.30), we have

$$\begin{aligned}
AB &= \Big( \sum_{\lambda\in\Lambda} a_\lambda \pi(\lambda) \Big) \Big( \sum_{\mu\in\Lambda} b_\mu \pi(\mu) \Big) \\
&= \sum_{\lambda,\mu\in\Lambda} a_\lambda b_\mu\, e^{-2\pi i \lambda_1\cdot\mu_2}\, \pi(\lambda + \mu) \qquad\qquad (5.32) \\
&= \sum_{\nu\in\Lambda} \Big( \sum_{\lambda\in\Lambda} a_\lambda b_{\nu-\lambda}\, e^{-2\pi i \lambda_1\cdot(\nu_2 - \lambda_2)} \Big) \pi(\nu).
\end{aligned}$$

Except for the phase factor $e^{-2\pi i \lambda_1\cdot(\nu_2-\lambda_2)}$, the coefficient sequence of $AB$ looks like
the convolution of the sequences $a$ and $b$. Thus, we make the following definition.

Definition 5.39. The twisted convolution $\natural_\Lambda$ of two (finite) sequences $a$ and $b$ over
$\Lambda$ is defined by

$$(a\, \natural_\Lambda\, b)(\nu) = \sum_{\lambda\in\Lambda} a_\lambda b_{\nu-\lambda}\, e^{-2\pi i \lambda_1\cdot(\nu_2-\lambda_2)}. \tag{5.33}$$

Remark 5.40. 1. If $\Lambda = \mathbb{Z}^{2d}$, then the phase factor disappears and $a\, \natural_{\mathbb{Z}^{2d}}\, c = a * c$ is
just the ordinary convolution and thus commutative.

2. However, in general, the twisted convolution $\natural_\Lambda$ is not commutative.

3. By pulling in absolute values, we have

$$|(a\, \natural_\Lambda\, b)(\nu)| \le \sum_{\lambda\in\Lambda} |a_\lambda|\, |b_{\nu-\lambda}| = (|a| * |b|)(\nu)$$

for all $\nu \in \Lambda$. We may therefore apply Young’s inequality (Lemma 5.17) and
obtain

$$\|a\, \natural_\Lambda\, b\|_p \le \|a\|_1\, \|b\|_p \tag{5.34}$$

whenever $a \in \ell^1(\Lambda)$ and $b \in \ell^p(\Lambda)$, $1 \le p \le \infty$. Thus, convolution inequalities
for the ordinary convolution imply immediately analogous inequalities for the
twisted convolution, and $\natural_\Lambda$ is well defined on many sequence spaces.
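The three points of Remark 5.40 can be made concrete with a few lines of code. This is a sketch for $d = 1$ on the lattice $\Lambda = \alpha\mathbb{Z} \times \beta\mathbb{Z}$ (an assumed example); a lattice point $\lambda = (\alpha k, \beta l)$ is stored as the integer pair `(k, l)`, so the phase factor in (5.33) depends only on the hypothetical parameter `theta = alpha * beta`.

```python
import cmath
import math
import random


def twisted_convolution(a, b, theta):
    """Twisted convolution (5.33) on alpha Z x beta Z, with theta = alpha * beta.

    a and b are dicts mapping integer pairs (k, l) to complex coefficients.
    For lambda = (k, l) and mu = nu - lambda = (p, q) the phase factor is
    exp(-2 pi i lambda_1 mu_2) = exp(-2 pi i theta k q).
    """
    out = {}
    for (k, l), x in a.items():
        for (p, q), y in b.items():
            nu = (k + p, l + q)
            phase = cmath.exp(-2j * math.pi * theta * k * q)
            out[nu] = out.get(nu, 0j) + x * y * phase
    return out


def max_gap(c1, c2):
    """Largest coefficient difference between two finitely supported sequences."""
    keys = set(c1) | set(c2)
    return max(abs(c1.get(key, 0j) - c2.get(key, 0j)) for key in keys)


random.seed(1)
box = [(k, l) for k in range(-2, 3) for l in range(-2, 3)]
a = {idx: complex(random.uniform(-1, 1), random.uniform(-1, 1)) for idx in box}
b = {idx: complex(random.uniform(-1, 1), random.uniform(-1, 1)) for idx in box}

# 1. theta = 1 (the lattice Z^2): the phase factor is trivial, so the
#    twisted convolution agrees with the ordinary one (theta = 0).
gap_integer_lattice = max_gap(twisted_convolution(a, b, 1.0), twisted_convolution(a, b, 0.0))

# 2. theta = 1/2: the twisted convolution is not commutative.
gap_commutator = max_gap(twisted_convolution(a, b, 0.5), twisted_convolution(b, a, 0.5))

# 3. Pointwise domination by the ordinary convolution of the absolute values.
abs_a = {idx: abs(x) for idx, x in a.items()}
abs_b = {idx: abs(x) for idx, x in b.items()}
twisted = twisted_convolution(a, b, 0.5)
dominating = twisted_convolution(abs_a, abs_b, 0.0)
dominated = all(abs(twisted[nu]) <= dominating[nu].real + 1e-12 for nu in twisted)
```

With two point masses, `twisted_convolution({(1, 0): 1}, {(0, 1): 1}, 0.5)` gives a coefficient close to $-1$ at `(1, 1)`, whereas the reversed order gives $+1$, which is the simplest instance of point 2.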

Proposition 5.41. The subspace $C^*(\Lambda)$ is a $C^*$-subalgebra of $\mathcal{B}(L^2(\mathbb{R}^d))$.

Proof. If $A, B \in \mathcal{A}_0(\Lambda)$, then by (5.32) $AB$ is again a finite linear combination of
time-frequency shifts and thus $AB \in \mathcal{A}_0(\Lambda)$.

Next note that $\pi(z)^* = (M_\xi T_x)^* = T_{-x} M_{-\xi} = e^{-2\pi i x\cdot\xi} M_{-\xi} T_{-x} = e^{-2\pi i x\cdot\xi}\, \pi(-z)$.
If $A = \sum_{\lambda\in\Lambda} a_\lambda \pi(\lambda) \in \mathcal{A}_0(\Lambda)$, then the adjoint operator $A^*$ is

$$A^* = \sum_{\lambda\in\Lambda} \overline{a_\lambda}\, e^{-2\pi i \lambda_1\cdot\lambda_2}\, \pi(-\lambda) \in \mathcal{A}_0(\Lambda).$$

As a consequence, $\mathcal{A}_0(\Lambda)$ is a *-subalgebra of $\mathcal{B}(L^2(\mathbb{R}^d))$. Hence, its closure in the
operator norm $C^*(\Lambda)$ is a $C^*$-subalgebra of $\mathcal{B}(L^2(\mathbb{R}^d))$. □

We can now go back to Section 5.2 and imitate all the constructions about Fourier
series and repeat our previous questions in the context of the rotation algebras
$C^*(\Lambda)$.

Let us first introduce some interesting subalgebras of C*(A) that are easier to 
work with than the full C* -algebra. 

To avoid convergence questions, one often resorts to the algebra of absolutely
convergent series of time-frequency shifts:

$$\mathcal{A}(\Lambda) = \Big\{ A \in \mathcal{B}(L^2(\mathbb{R}^d)) : A = \sum_{\lambda\in\Lambda} a_\lambda \pi(\lambda),\ a \in \ell^1(\Lambda) \Big\}.$$

If $A \in \mathcal{A}(\Lambda)$, the sum of time-frequency shifts converges absolutely in the operator
norm on $L^2(\mathbb{R}^d)$. Note that this is in complete analogy with the procedure for Fourier
series.

Since the coefficients of an operator in $C^*(\Lambda)$ are unique, we may endow $\mathcal{A}(\Lambda)$
with the norm

$$\|A\|_{\mathcal{A}(\Lambda)} = \|a\|_1.$$

By Lemma 5.38(2), this is indeed a norm on $\mathcal{A}(\Lambda)$.

By (5.33) and (5.34) $\mathcal{A}(\Lambda)$ is a Banach algebra embedded in $C^*(\Lambda)$. Since there
are no convergence issues, $\mathcal{A}(\Lambda)$ might be called the “lazy man’s rotation algebra.”

As a variation we may also consider algebras of weighted absolutely convergent
series of time-frequency shifts. If $v$ is a submultiplicative weight on $\Lambda$, we
introduce

$$\mathcal{A}_v(\Lambda) = \Big\{ A \in \mathcal{B}(L^2(\mathbb{R}^d)) : A = \sum_{\lambda\in\Lambda} a_\lambda \pi(\lambda),\ a \in \ell^1_v(\Lambda) \Big\}$$

with norm

$$\|A\|_{\mathcal{A}_v} = \|a\|_{\ell^1_v}.$$

In noncommutative geometry one considers the smooth noncommutative torus

$$\mathcal{A}_\infty(\Lambda) = \Big\{ A \in \mathcal{B}(L^2(\mathbb{R}^d)) : A = \sum_{\lambda\in\Lambda} a_\lambda \pi(\lambda),\ |a_\lambda| = \mathcal{O}\big((1 + |\lambda|)^{-N}\big) \text{ for all } N \ge 0 \Big\}.$$

If we write $v_s$ for the polynomial weights $v_s(z) = (1 + |z|)^s$, then clearly the smooth
noncommutative torus is the intersection

$$\mathcal{A}_\infty(\Lambda) = \bigcap_{s \ge 0} \mathcal{A}_{v_s}(\Lambda). \tag{5.35}$$

The analogy between Fourier series and series of time-frequency shifts leads
to the natural questions: Is there a version of Wiener’s Lemma for the subalgebras
of the rotation algebras? Is there a corresponding version of Wiener’s Lemma
for twisted convolution? These questions will be answered affirmatively in the
following.



In contrast to Section 5.2 (where we treated absolutely convergent Fourier series 
before convolution operators), we consider convolution operators first and then turn 
to absolutely convergent series of time-frequency shifts. 


5.3.3.3 Wiener’s Lemma for Twisted Convolution 

Given $h \in \ell^1(\Lambda)$ or $h \in \ell^1_v(\Lambda)$, define the twisted convolution operator $C_h^{\natural}$ acting on
$a \in \ell^2(\Lambda)$ by

$$C_h^{\natural} a = h\, \natural_\Lambda\, a.$$

By Young’s inequality, $C_h^{\natural}$ maps $\ell^p(\Lambda)$ into $\ell^p(\Lambda)$. Likewise one can use convolution
from the right and define a twisted convolution operator $a \to a\, \natural_\Lambda\, h$. The results are
the same.

We now have the following noncommutative counterpart of Theorem 5.18 [118].

Theorem 5.42. Fix a weight $v$ on $\Lambda$ that satisfies the GRS condition. Assume that
$h \in \ell^1_v(\Lambda)$ and that $C_h^{\natural}$ is invertible on $\ell^2(\Lambda)$; then $h$ is invertible in the algebra
$(\ell^1_v(\Lambda), \natural_\Lambda)$ and there exists a $g \in \ell^1_v(\Lambda)$ such that $(C_h^{\natural})^{-1} = C_g^{\natural}$. As a consequence,
$C_h^{\natural}$ is invertible on all $\ell^p(\Lambda)$ simultaneously.

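Proof. Although the formulation is identical to that of Theorem 5.18 (we have
only replaced $*$ by $\natural_\Lambda$), the proof is radically different, because we have lost
commutativity and we can no longer use Fourier series. Instead we will use the
results on matrix algebras from the previous section. The following proof is taken
from [119].

Let us interpret $C_h^{\natural}$ as a matrix acting on $\ell^2(\Lambda)$ and find its entries:

$$(C_h^{\natural} a)(\nu) = \sum_{\lambda\in\Lambda} h_\lambda a_{\nu-\lambda}\, e^{-2\pi i \lambda_1\cdot(\nu_2-\lambda_2)} = \sum_{\mu\in\Lambda} h_{\nu-\mu}\, a_\mu\, e^{-2\pi i (\nu_1-\mu_1)\cdot\mu_2}.$$

Thus, the matrix $M$ of $C_h^{\natural}$ has the entries

$$M_{\nu\mu} = h_{\nu-\mu}\, e^{-2\pi i (\nu_1-\mu_1)\cdot\mu_2}.$$

Clearly, $M$ is convolution-dominated. Moreover, since

$$\sup_{\nu\in\Lambda} |M_{\nu,\nu-\mu}| = |h_\mu|,$$

we find that $M \in \mathcal{C}_v$ and that

$$\|M\|_{\mathcal{C}_v} = \|h\|_{\ell^1_v}. \tag{5.36}$$

Since $\mathcal{C}_v$ is inverse-closed in $\mathcal{B}(\ell^2(\Lambda))$ and $M = C_h^{\natural}$ is invertible on $\ell^2(\Lambda)$,
Theorem 5.30 guarantees that $(C_h^{\natural})^{-1} = M^{-1} \in \mathcal{C}_v$.

It is left to show that $M^{-1}$ is again a twisted convolution operator. Let $g \in \ell^2(\Lambda)$
be the unique element such that $C_h^{\natural} g = \delta_0$. As in the proof of Theorem 5.18, we
argue that $C_g^{\natural}$ is the inverse of $C_h^{\natural}$.

The (twisted) convolution operator $C_g^{\natural}$ is certainly defined on the dense subspace
$\ell^0(\Lambda) = \{a : \operatorname{supp} a \text{ is finite}\}$, and maps $\ell^0(\Lambda)$ into $\ell^2(\Lambda)$. Then for all $a \in \ell^0(\Lambda)$

$$C_h^{\natural}\big(C_g^{\natural} - M^{-1}\big) a = h\, \natural_\Lambda\, (g\, \natural_\Lambda\, a) - a = a - a = 0.$$

Since $C_h^{\natural}$ is injective, we have $C_g^{\natural} = M^{-1}$ on the dense subspace $\ell^0(\Lambda)$. Hence the matrix of $C_g^{\natural}$ coincides
with $M^{-1}$, and so (5.36) implies that $g \in \ell^1_v(\Lambda)$. □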
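The identification of the twisted convolution operator with a convolution-dominated matrix can be checked on finitely supported sequences. The sketch below again assumes $d = 1$ and the lattice $\alpha\mathbb{Z} \times \beta\mathbb{Z}$, with lattice points stored as integer pairs and the hypothetical constant `THETA` standing for $\alpha\beta$; the sequences `h` and `a` are arbitrary test data.

```python
import cmath
import math

THETA = 0.5  # assumed value of alpha * beta, for illustration only


def phase(theta, k, q):
    """Phase factor exp(-2 pi i theta k q)."""
    return cmath.exp(-2j * math.pi * theta * k * q)


def twisted_convolution(h, a, theta=THETA):
    """(h natural a)(nu) = sum_lambda h_lambda a_{nu-lambda} exp(-2 pi i lambda_1 (nu_2 - lambda_2))."""
    out = {}
    for (k, l), x in h.items():
        for (p, q), y in a.items():
            nu = (k + p, l + q)
            out[nu] = out.get(nu, 0j) + x * y * phase(theta, k, q)
    return out


def matrix_entry(h, nu, mu, theta=THETA):
    """M_{nu,mu} = h_{nu-mu} exp(-2 pi i (nu_1 - mu_1) mu_2)."""
    d = (nu[0] - mu[0], nu[1] - mu[1])
    return h.get(d, 0j) * phase(theta, d[0], mu[1])


def apply_matrix(h, a, theta=THETA):
    """Matrix-vector product (M a)(nu) = sum_mu M_{nu,mu} a_mu on the finite support."""
    rows = {(d[0] + mu[0], d[1] + mu[1]) for d in h for mu in a}
    return {nu: sum(matrix_entry(h, nu, mu, theta) * a[mu] for mu in a) for nu in rows}


h = {(0, 0): 1.0, (1, 0): 0.5j, (0, 1): -0.25, (1, 1): 0.125}
a = {(0, 0): 2.0, (2, -1): 1.0 - 1.0j, (-1, 3): 0.5}

by_convolution = twisted_convolution(h, a)
by_matrix = apply_matrix(h, a)
```

The two dictionaries `by_convolution` and `by_matrix` should agree up to rounding, and every entry on the side diagonal indexed by `d` has modulus `abs(h[d])`, which is the observation behind (5.36).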

Note that the deduction of Theorem 5.42 from Baskakov’s Theorem 5.30 is
rigorous. So far the only results that we have not proved completely are the results
of Baskakov (or, equivalently, the operator-valued version of Wiener’s Lemma by
Bochner and Phillips [24]).

Since the matrix of $C_h^{\natural}$ is convolution-dominated and belongs to $\mathcal{C}_v$,
Corollary 5.33 can be rephrased as the spectral invariance of twisted convolution
operators.

Corollary 5.43. Fix $h \in \ell^1_v(\Lambda)$. Then the following are equivalent:

1. $C_h^{\natural}$ is invertible on $\ell^2(\Lambda)$.

2. $C_h^{\natural}$ is invertible on $\ell^p_m(\Lambda)$ for some $p \in [1,\infty]$ and some $v$-moderate weight $m$.

3. $C_h^{\natural}$ is invertible simultaneously on $\ell^p_m(\Lambda)$ for all $p \in [1,\infty]$ and all $v$-moderate
weights $m$.


5.3.3.4 Wiener’s Lemma for the Rotation Algebra

The following result was proved in [118]. 

Theorem 5.44. Assume that $v$ is submultiplicative and satisfies the GRS condition.
If $A \in \mathcal{A}_v(\Lambda)$ and $A$ is invertible on $L^2(\mathbb{R}^d)$, then $A^{-1} \in \mathcal{A}_v(\Lambda)$.

Let us make plausible why the result follows from Wiener’s Lemma for twisted
convolution (Theorem 5.42). Consider the mapping $\pi : \ell^1_v(\Lambda) \to \mathcal{A}_v(\Lambda)$ defined by

$$\pi(a) = \sum_{\lambda\in\Lambda} a_\lambda\, \pi(\lambda).$$

By (5.32) and Lemma 5.38(2), $\pi$ is an isometric *-isomorphism between $\ell^1_v(\Lambda)$
(with respect to $\natural_\Lambda$) and $\mathcal{A}_v(\Lambda)$ (with composition of operators). Thus, $(\ell^1_v(\Lambda), \natural_\Lambda)$
and $\mathcal{A}_v(\Lambda)$ are just different realizations of the same abstract Banach *-algebra.
Clearly, Wiener’s Lemma for one realization should imply Wiener’s Lemma for the
other realization.

The subtle part is in the hypotheses: In Theorem 5.42 we assume that $C_h^{\natural}$ is
invertible on $\ell^2(\Lambda)$, whereas in Theorem 5.44 we assume the invertibility of
$A = \pi(h)$ on $L^2(\mathbb{R}^d)$.

The technical part of the proof requires the spectral identity

$$\sigma_{\mathcal{B}(L^2)}(\pi(h)) = \sigma_{\mathcal{B}(\ell^2(\Lambda))}(C_h^{\natural}) \tag{5.37}$$

for all $h \in \ell^1_v(\Lambda)$. This is a question of representation theory and requires some
effort; see [118]. Once (5.37) has been proved, Theorem 5.44 follows from

$$\sigma_{\mathcal{A}_v(\Lambda)}(\pi(h)) = \sigma_{\ell^1_v}(h) = \sigma_{\mathcal{B}(\ell^2(\Lambda))}(C_h^{\natural}) = \sigma_{\mathcal{B}(L^2)}(\pi(h)).$$

The interaction between time-frequency analysis and noncommutative geometry
goes much further. The Banach algebras $\mathcal{A}_v(\Lambda)$ and their spectral invariance in
the noncommutative torus $C^*(\Lambda)$ play a central role in the theory and are the starting
point for many developments. We refer to [55, 170, 171, 199-201, 206] for
noncommutative geometry. A generalization of Theorem 5.44 to time-frequency
shifts not supported on a lattice is given in [9].

Once more the GRS condition characterizes those weights for which Wiener’s
Lemma for the rotation algebra holds.

Corollary 5.45. $\mathcal{A}_v(\Lambda)$ is inverse-closed in $C^*(\Lambda)$ and in $\mathcal{B}(L^2(\mathbb{R}^d))$ if and only if
$v$ satisfies the GRS condition.

Again the necessity of the GRS condition follows from a counterexample. If $v$
violates the GRS condition, then there are $\lambda \in \Lambda$ and $a > 0$ such that $v(n\lambda) \ge e^{an}$
for $n \ge n_0$. Let $A = \mathrm{Id}_{L^2} - e^{-\delta}\pi(\lambda) \in \mathcal{A}_v(\Lambda)$ with $\delta > 0$. Then $A$ is invertible in $\mathcal{B}(L^2(\mathbb{R}^d))$
with inverse

$$A^{-1} = \sum_{n=0}^{\infty} e^{-n\delta}\, \pi(\lambda)^n = \sum_{n=0}^{\infty} e^{-n\delta}\, \gamma_n\, \pi(n\lambda),$$

where the $\gamma_n$ are phase factors, $|\gamma_n| = 1$, resulting from the commutation rule (5.30).
Then

$$\|A^{-1}\|_{\mathcal{A}_v} = \sum_{n=0}^{\infty} e^{-n\delta}\, v(n\lambda) = \infty,$$

and thus $A^{-1} \notin \mathcal{A}_v(\Lambda)$ whenever $\delta < a$.

This counterexample shows that the noncommutative torus $\mathcal{A}_v(\Lambda)$ with exponential
weight $v$ is not inverse-closed in $C^*(\Lambda)$. This observation was already made
in [210] by means of a rather subtle argument.
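The dichotomy behind this counterexample can be seen from partial sums of the series $\sum_n e^{-n\delta} v(n\lambda)$. The sketch below treats the weight as a function of $n$ only; the exponential weight $e^{an}$ with $a = 1$, the polynomial weight $(1+n)^3$, and the value $\delta = 0.5$ are assumptions chosen for illustration.

```python
import math


def partial_norm(delta, weight, terms):
    """Partial sum sum_{n < terms} exp(-n delta) * weight(n) of the weighted l^1 norm."""
    return sum(math.exp(-n * delta) * weight(n) for n in range(terms))


def exp_weight(n, a=1.0):
    """Exponential weight along the ray n * lambda; violates the GRS condition."""
    return math.exp(a * n)


def poly_weight(n, s=3.0):
    """Polynomial weight along the ray n * lambda; satisfies the GRS condition."""
    return (1 + n) ** s


def grs_root(weight, n):
    """The quantity weight(n)^(1/n), which tends to 1 under the GRS condition."""
    return weight(n) ** (1.0 / n)


delta = 0.5  # delta < a = 1, so the exponentially weighted series diverges
diverging = [partial_norm(delta, exp_weight, terms) for terms in (50, 100, 200)]
converging = [partial_norm(delta, poly_weight, terms) for terms in (200, 400)]
roots = (grs_root(exp_weight, 500), grs_root(poly_weight, 500))
```

The entries of `diverging` grow without bound, the two entries of `converging` agree to many digits, and `roots` is approximately $(e, 1.04)$: the $n$th root stays away from 1 only for the exponential weight.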

In the motivating Section 5.2 we discussed the quotient rule and proved that
$C^k(\mathbb{T})$ is inverse-closed in $C(\mathbb{T})$ (Lemma 5.1). Consequently, if $f \in C^\infty(\mathbb{T})$ and
$f(t) \ne 0$ for all $t \in \mathbb{T}$, then $1/f \in C^\infty(\mathbb{T})$. The statement for the smooth noncommutative
torus $\mathcal{A}_\infty(\Lambda)$ is completely analogous and is a celebrated result of Connes [54].

Corollary 5.46. If $A \in \mathcal{A}_\infty(\Lambda)$ and $A$ is invertible on $L^2(\mathbb{R}^d)$, then $A^{-1} \in \mathcal{A}_\infty(\Lambda)$, i.e.,
$A^{-1} = \sum_{\lambda\in\Lambda} b_\lambda \pi(\lambda)$ with rapidly decaying $b$.

Proof. Recall from (5.35) that $\mathcal{A}_\infty(\Lambda) = \bigcap_{s\ge 0} \mathcal{A}_{v_s}(\Lambda)$ for the polynomial weight
$v_s(\lambda) = (1 + |\lambda|)^s$. Since $A \in \mathcal{A}_\infty(\Lambda) \subseteq \mathcal{A}_{v_s}(\Lambda)$ for all $s \ge 0$ and $A$ is invertible
on $L^2(\mathbb{R}^d)$, Theorem 5.44 asserts that $A^{-1} \in \mathcal{A}_{v_s}(\Lambda)$. This is true for all $s \ge 0$, so
$A^{-1} \in \mathcal{A}_\infty(\Lambda)$. □

In view of (5.35) the above statement is a simple corollary of Theorem 5.44.
For further discussion and background we refer to [172] and [202]; an independent
time-frequency proof was given by Janssen [147].


5.3.4 Convolution Operators on Groups 

Another variation concerns the form of Wiener’s Lemma for convolution operators.
For this variation we replace the Abelian group $\mathbb{Z}$ by a general locally compact
group $\mathcal{G}$.


5.3.4.1 The Basics 

Let us first mention the basic facts about locally compact groups [93, 139]. We
write $x, y, \dots$ for the elements of $\mathcal{G}$. The group multiplication is $(x, y) \to xy$ and
is continuous by definition. Every locally compact group possesses a left-invariant
measure, the Haar measure $dx$, which satisfies

$$\int_{\mathcal{G}} f(ax)\, dx = \int_{\mathcal{G}} f(x)\, dx \quad \text{for all } a \in \mathcal{G},$$

for all continuous functions $f$ with compact support in $\mathcal{G}$. As usual, the $L^p$-norm
is defined as $\|f\|_p = \big( \int_{\mathcal{G}} |f(x)|^p\, dx \big)^{1/p}$, and $L^p(\mathcal{G})$ is the completion of the
continuous functions with compact support with respect to the $p$-norm. We write
$|U| = \int_{\mathcal{G}} \chi_U(x)\, dx$ for the Haar measure of the set $U \subseteq \mathcal{G}$. The convolution is

$$(f * g)(x) = \int_{\mathcal{G}} f(y)\, g(y^{-1}x)\, dy. \tag{5.38}$$

As in the case of $\mathbb{Z}$, we may now study the convolution operator $C_f$ with symbol $f$
acting on a function $h$: $C_f h = f * h$. Young’s inequality holds for arbitrary locally
compact groups; therefore, we have

$$\|f * h\|_p \le \|f\|_1\, \|h\|_p$$

for all $f \in L^1(\mathcal{G})$ and $h \in L^p(\mathcal{G})$. Consequently, whenever $f \in L^1(\mathcal{G})$, the convolution
operator $C_f$ is bounded on $L^p(\mathcal{G})$ for all $p \in [1,\infty]$. In particular, $L^1(\mathcal{G})$ is a
Banach algebra with respect to convolution. It is commutative if and only if $\mathcal{G}$ is a
locally compact Abelian group (see Exercises).

The algebra $L^1(\mathcal{G})$ can be equipped with an involution. Let $\Delta : \mathcal{G} \to \mathbb{R}_+$ be the
Haar modulus of $\mathcal{G}$, which is defined by the right translation invariance
$\int_{\mathcal{G}} f(xa)\, dx = \Delta(a^{-1}) \int_{\mathcal{G}} f(x)\, dx$. Then $f^*(x) = \overline{f(x^{-1})}\, \Delta(x^{-1})$ is an involution on
$L^1(\mathcal{G})$, and $L^1(\mathcal{G})$ becomes a Banach *-algebra.

Again we may study the spectrum of $C_f$ as an operator acting on $L^p(\mathcal{G})$. To make
the possible dependence of the spectrum on $p$ explicit, we denote

$$\sigma_{\mathcal{B}(L^p)}(C_f) = \{ \lambda \in \mathbb{C} : C_f - \lambda I \text{ is not invertible on } L^p(\mathcal{G}) \}.$$

Now the fundamental question is whether the spectrum is independent of $p$. As we
have seen in Section 5.2, for the group $\mathcal{G} = \mathbb{Z}$ this form of spectral invariance is
equivalent to Wiener’s Lemma.

What happens when we replace $\mathbb{Z}$ by more general groups; in particular, what
happens for non-Abelian groups? The answer to this question is a far-reaching
generalization of Wiener’s Lemma and has led to very deep mathematics.

We start with an abstract answer [12].

Lemma 5.47. The spectral invariance $\sigma_{\mathcal{B}(L^p)}(C_f) = \sigma_{\mathcal{B}(L^2)}(C_f)$ for $1 \le p \le \infty$
holds for all $f \in L^1(\mathcal{G})$ if and only if $\mathcal{G}$ is amenable and symmetric.

Thus, the appropriate formulation of Wiener’s Lemma on locally compact groups 
holds only for groups with certain properties. As a first insight we note that Wiener’s 
Lemma does not generalize to arbitrary locally compact groups. Its validity depends 
subtly on the group structure. 

But let us first explain the terms in Lemma 5.47. 

A locally compact group $\mathcal{G}$ is called amenable if there exists a continuous linear
functional $m$ on $L^\infty(\mathcal{G})$ such that $m(1) = 1$, $m$ is positive ($f \ge 0 \Rightarrow m(f) \ge 0$),
and $m(T_x f) = m(f)$ holds for all $f \in L^\infty(\mathcal{G})$ and $x \in \mathcal{G}$. One says that $\mathcal{G}$ possesses
a translation-invariant mean on $L^\infty(\mathcal{G})$. Here $T_x$ is the translation operator on $L^\infty(\mathcal{G})$
defined by $T_x f(y) = f(x^{-1}y)$, $x, y \in \mathcal{G}$.

The group $\mathcal{G}$ is called symmetric if the Banach *-algebra $L^1(\mathcal{G})$ is symmetric.
Recall from Section 5.2.5 that this means that the spectrum of positive elements is
positive:

$$\sigma_{L^1(\mathcal{G})}(f^* * f) \subseteq [0,\infty) \quad \text{for all } f \in L^1(\mathcal{G}).$$

Both properties have been studied extensively and constitute independent
directions of harmonic analysis. For many classes of locally compact groups it is
known whether or not they possess these properties. Every compact group and every
locally compact Abelian group is both symmetric and amenable. Roughly speaking,
amenability and symmetry are properties that indicate the distance of a given group
to the class of commutative groups or to compact groups. Currently there is no
example of a locally compact group that is symmetric, but not amenable. It is
conjectured that every symmetric group is amenable.

To get a feeling for these two concepts, let us verify that every compact group is
amenable. Indeed, the Haar measure is an invariant mean on $L^\infty(\mathcal{G})$. Since $L^\infty(\mathcal{G}) \subseteq
L^1(\mathcal{G})$ for compact $\mathcal{G}$, the invariance properties of the Haar measure imply that

$$m(T_x f) = \int_{\mathcal{G}} T_x f(y)\, dy = \int_{\mathcal{G}} f(x^{-1}y)\, dy = \int_{\mathcal{G}} f(y)\, dy = m(f), \qquad f \in L^\infty(\mathcal{G}),\ x \in \mathcal{G}.$$

To construct an invariant mean on $\mathbb{R}^d$, we start with the functionals

$$m_n(f) = \mathrm{vol}(B_n)^{-1} \int_{B_n} f(x)\, dx,$$

where $B_n = \{x \in \mathbb{R}^d : |x| \le n\}$. The functionals $m_n$ are in the unit ball of $L^\infty(\mathbb{R}^d)^*$.
Since the unit ball of a dual Banach space is weak*-compact, the sequence $m_n$ possesses
at least one point of accumulation $m$. Since $m_n(1) = 1$, we also have $m(1) = 1$
and so $m \ne 0$. This limit point is translation-invariant and positive, and is thus an
invariant mean on $L^\infty(\mathbb{R}^d)$. See Exercises.

Next, every compact group is symmetric and every locally compact Abelian
group is symmetric. However, symmetry is more subtle. The symmetry of compact
groups requires some representation theory, and the symmetry of locally compact
Abelian groups requires several properties of the Fourier transform.

Many classes of groups are known to be amenable and symmetric. “Extremely
noncommutative” groups, in particular all semisimple Lie groups including $SL(2,\mathbb{R})$
or $SL(2,\mathbb{C})$, are neither amenable nor symmetric. Not surprisingly, Wiener’s Lemma
fails for these groups, and the spectrum of a convolution operator depends crucially
on the domain space $L^p(\mathcal{G})$.


5.3.4.2 Convolution Operators on Groups of Polynomial Growth 

A complete structural characterization of all groups that are symmetric and amenable
seems to be out of reach. Therefore, one restricts the investigation to certain
natural classes of locally compact groups and studies convolution operators on
these groups. This is what we will do in the following.

A natural condition to impose on a group is how the size of a sequence of
neighborhoods grows. We will consider groups of polynomial growth.

We say that $\mathcal{G}$ is compactly generated if there exists a neighborhood $U \subseteq \mathcal{G}$ of
the identity element such that $\mathcal{G} = \bigcup_{n=1}^{\infty} U^n$, where $U^n = \{u = u_1 u_2 \cdots u_n : u_j \in U\}$.
Such a neighborhood is called a generating neighborhood.

A group $\mathcal{G}$ is said to have polynomial growth if for some generating and relatively
compact neighborhood $U$ of the identity there exist positive constants $C, d$
such that

$$|U^n| \le C n^d \quad \text{for all } n \in \mathbb{N}.$$

Every compact group has polynomial growth, because the Haar measure is finite
and we may take $U = \mathcal{G}$ as a generating neighborhood. Also, every compactly
generated, locally compact Abelian group possesses polynomial growth. This is easy to
see for the elementary groups $\mathcal{G} = \mathbb{R}^d$ and $\mathcal{G} = \mathbb{Z}^d$. Let $U = [-1,1]^d \subseteq \mathbb{R}^d$; then
$U^n = [-n,n]^d$ and $|U^n| = (2n)^d$, and thus $\mathbb{R}^d$ possesses polynomial growth. Likewise
$\mathbb{Z}^d$ and every finitely generated Abelian group possess polynomial growth. The
proof for general locally compact Abelian groups requires the structure theorem for
such groups [139], but the proof can be carried out similarly.
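For the discrete group $\mathbb{Z}^d$ the Haar measure is the counting measure, so the growth of $|U^n|$ can be counted directly. The following sketch does this for $d = 2$ with the generating set $U = \{-1, 0, 1\}^2$ (an assumed choice; any finite generating set containing the identity would do) and compares the result with the bound $C n^d$ for the assumed constant $C = 9$.

```python
from itertools import product


def ball_sizes(generators, steps):
    """Sizes |U^n| for n = 1, ..., steps, where U is a finite subset of Z^d.

    U^n is the set of all sums u_1 + ... + u_n with u_j in U (the group law
    of Z^d is written additively), and |.| is the counting measure.
    """
    current = set(generators)
    sizes = [len(current)]
    for _ in range(steps - 1):
        current = {tuple(x + g for x, g in zip(u, gen)) for u in current for gen in generators}
        sizes.append(len(current))
    return sizes


U = list(product((-1, 0, 1), repeat=2))  # generating neighborhood of 0 in Z^2
sizes = ball_sizes(U, 6)                 # |U^n| = (2n + 1)^2 for n = 1, ..., 6
within_bound = all(size <= 9 * n ** 2 for n, size in enumerate(sizes, start=1))
```

Here `sizes` equals `[9, 25, 49, 81, 121, 169]`, so $|U^n| = (2n+1)^2 \le 9 n^2$, in line with polynomial growth of order $d = 2$.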

Every compactly generated group of polynomial growth is amenable. The
construction of an invariant mean is similar to the one on $\mathbb{R}^d$. Let
$m_n(f) = |U^n|^{-1} \int_{U^n} f(x)\, dx$; then $m_n(1) = 1$ and $m_n$ is positive. It is not easy to
show that every weak-* limit point of this sequence is an invariant mean on $L^\infty(\mathcal{G})$.
See [148, 149].

In view of Lemma 5.47 the next question is whether every compactly gener¬ 
ated group of polynomial growth is symmetric. If this is true, then the version of 
Wiener’s Lemma for convolution operators will also hold for every group of poly¬ 
nomial growth. 

The study of the symmetry of locally compact groups has a long history. It was known early on that highly non-Abelian groups, such as the matrix groups $GL(d,\mathbb{R})$, $SL(d,\mathbb{R})$, and more generally the semisimple Lie groups, cannot be symmetric. On the other hand, groups that are “almost Abelian,” such as nilpotent groups, are symmetric [164, 171].

So far the final touch in the quest for symmetric groups was obtained by Losert. 
He derived a deep structure theorem for groups of polynomial growth and as a con¬ 
sequence showed that such groups are symmetric [168, 169]. The following theorem 
is a milestone in noncommutative harmonic analysis and is one of the deepest gen¬ 
eralizations of the original Wiener’s Lemma. 

Theorem 5.48 (Losert [169]). Every compactly generated group of polynomial 
growth is symmetric. 

Combining all properties of groups of polynomial growth and applying 
Lemma 5.47, we obtain the spectral invariance of convolution operators. 

Corollary 5.49. Assume that $\mathcal{G}$ is a compactly generated group of polynomial growth and $f \in L^1(\mathcal{G})$. Then for $1 \le p \le \infty$,

$$\sigma_{\mathcal{B}(L^p)}(C_f) = \sigma_{\mathcal{B}(L^2)}(C_f).$$


In other words, the spectrum of the convolution operator is independent of the $L^p$-space, and the version of Wiener’s Lemma for convolution operators holds in all groups of polynomial growth.

Next we consider weighted versions of Wiener’s Lemma on groups of polynomial growth. Recall that a locally bounded weight function $v$ on a locally compact group $\mathcal{G}$ is called submultiplicative if $v(xy) \le v(x)v(y)$ for all $x, y \in \mathcal{G}$ and symmetric if $v(x^{-1}) = v(x)$ for all $x \in \mathcal{G}$. The weighted $L^1$-space $L^1_v(\mathcal{G})$ is defined by the norm

$$\|f\|_{L^1_v} = \int_{\mathcal{G}} |f(x)|\, v(x)\, dx.$$

If $v$ is submultiplicative and symmetric, then $L^1_v(\mathcal{G})$ is a Banach *-algebra with respect to convolution and the involution $f^*(x) = \overline{f(x^{-1})}\,\Delta(x^{-1})$ and is embedded in $L^1(\mathcal{G})$. See Exercises.

We have already seen in Section 5.3.1 that the GRS condition characterizes those weights for which a weighted version of Wiener’s Lemma holds. This pattern carries over to the much more difficult situation of groups of polynomial growth.

Theorem 5.50 ([91, 89]). Assume that $\mathcal{G}$ is a compactly generated group of polynomial growth. Then the following conditions are equivalent:

1. The Banach *-algebra $L^1_v(\mathcal{G})$ is symmetric.

2. Spectral invariance holds:

$$\sigma_{L^1_v}(C_h) = \sigma_{L^2}(C_h) \quad \text{for all } h \in L^1_v(\mathcal{G}).$$

3. The weight $v$ satisfies the GRS condition

$$\lim_{n\to\infty} v(x^n)^{1/n} = 1 \quad \text{for all } x \in \mathcal{G}.$$

This theorem is much deeper than the original version of Wiener’s Lemma for weighted absolutely convergent Fourier series. The proof of the implication $(2) \Rightarrow (1)$ requires the full structure theorem of Losert [169] and a detailed analysis and corresponding modifications of Ludwig’s proof that nilpotent groups are symmetric [171].

The necessity of the GRS condition for nondiscrete groups is similar in spirit to the counterexamples we have seen before, but it is tricky and requires Gaussian estimates for the heat kernel on $\mathcal{G}$. For discrete groups, the following argument proves the implication $(2) \Rightarrow (3)$.

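The spectral invariance of (2) implies that $r_{\mathcal{B}(\ell^1_v)}(C_h) = r_{\mathcal{B}(\ell^2)}(C_h)$. Choose $h = \delta_x$ for $x \in \mathcal{G}$ and consider the convolution operator $C_{\delta_x} f = \delta_x * f$. Since $(\delta_x * f)(y) = f(x^{-1}y)$ is the translation by $x$, $C_{\delta_x}$ is unitary on $\ell^2(\mathcal{G})$ and thus $r_{\mathcal{B}(\ell^2)}(C_{\delta_x}) = 1$ for all $x \in \mathcal{G}$.

To treat $\mathcal{B}(\ell^1_v(\mathcal{G}))$, we first note that $r_{\mathcal{B}(\ell^1_v)}(C_h) = r_{\ell^1_v}(h)$, because

$$\|h\|_{\ell^1_v} = \|h * \delta_e\|_{\ell^1_v} \le \|C_h\|_{\mathcal{B}(\ell^1_v)} \le \|h\|_{\ell^1_v}.$$

So let us compute the spectral radius of $\delta_x$ in $\ell^1_v$:

$$r_{\ell^1_v}(\delta_x) = \lim_{n\to\infty} \|\delta_x * \cdots * \delta_x\|_{\ell^1_v}^{1/n} = \lim_{n\to\infty} \|\delta_{x^n}\|_{\ell^1_v}^{1/n} = \lim_{n\to\infty} v(x^n)^{1/n}.$$

Combining these observations, we find that

$$\lim_{n\to\infty} v(x^n)^{1/n} = r_{\ell^1_v}(\delta_x) = r_{\mathcal{B}(\ell^1_v)}(C_{\delta_x}) = r_{\mathcal{B}(\ell^2)}(C_{\delta_x}) = 1,$$

which is the GRS condition.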
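The quantity $v(x^n)^{1/n}$ is easy to examine numerically on the group $\mathbb{Z}$, where $x^n$ becomes $nx$. The sketch below assumes weights of the form $v(k) = e^{a|k|^b}$ and works with $\log v$ to avoid overflow; the helper names `grs_root` and `subexp` are ours.

```python
import math

def grs_root(log_v, x, n):
    """Return v(n*x)**(1/n), computed from log v (the group Z is written additively)."""
    return math.exp(log_v(n * x) / n)

def subexp(a, b):
    """log of the weight v(k) = exp(a * |k|**b)."""
    return lambda k: a * abs(k) ** b

x = 3
for b in (0.5, 1.0):
    roots = [grs_root(subexp(1.0, b), x, n) for n in (1, 10, 1000, 10**6)]
    print(b, [round(r, 4) for r in roots])
```

For $b = 1/2$ the values tend to 1, so the GRS condition holds at this $x$. For $b = 1$ they stay at $e^{a|x|} > 1$, so the exponential weight violates it.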

5.3.5 Pseudodifferential Operators 

We now turn from harmonic analysis to a topic in classical analysis and discuss pseudodifferential operators. These arise in partial differential equations, in quantum mechanics, or in wireless communications. So far we have dealt with sequences and matrices on the index set $\mathbb{Z}^d$; now we deal with functions on $\mathbb{R}^d$ and operators acting on functions. Instead of Fourier series we use the Fourier transform

$$\hat f(\xi) = \int_{\mathbb{R}^d} f(x)\, e^{-2\pi i x \cdot \xi}\, dx.$$

In this section we will not prove any result. The goal is to tell the story of spec¬ 
tral invariance for pseudodifferential operators and put it in the context of Wiener’s 
Lemma. The sequence of statements is the same as in the previous sections. 


5.3.5.1 The Basics 

A pseudodifferential operator in the Kohn-Nirenberg calculus is formally defined by the integral

$$K_\sigma f(x) = \int_{\mathbb{R}^d} \sigma(x,\xi)\, \hat f(\xi)\, e^{2\pi i x \cdot \xi}\, d\xi. \tag{5.39}$$

The function $\sigma$ is called the (Kohn-Nirenberg) symbol of the operator.

This integral is certainly well defined whenever $\sigma \in L^\infty(\mathbb{R}^{2d})$ and $f$ belongs to the Schwartz class $\mathcal{S}(\mathbb{R}^d)$. For more general symbols one may resort to a distributional interpretation. If $f, g \in \mathcal{S}(\mathbb{R}^d)$, then the function $R(g,f)(x,\xi) = e^{-2\pi i x \cdot \xi} g(x) \overline{\hat f(\xi)}$ belongs to $\mathcal{S}(\mathbb{R}^{2d})$. Then we may take a tempered distribution $\sigma \in \mathcal{S}'(\mathbb{R}^{2d})$ and define $K_\sigma$ weakly by the formula

$$\langle K_\sigma f, g \rangle = \langle \sigma, R(g,f) \rangle \quad \text{for } f, g \in \mathcal{S}(\mathbb{R}^d). \tag{5.40}$$

This weak interpretation defines a continuous operator from $\mathcal{S}(\mathbb{R}^d)$ to $\mathcal{S}'(\mathbb{R}^d)$. Clearly, (5.40) extends the definition (5.39) to general symbols. (As in Section 5.2.6, we take the duality conjugate-linear in the second term.)

If the symbol $\sigma$ depends only on the first variable, $\sigma(x,\xi) = m(x)$, then $K_\sigma f(x) = \int_{\mathbb{R}^d} m(x) \hat f(\xi) e^{2\pi i x \cdot \xi}\, d\xi = m(x) f(x)$ is just a multiplication operator. If $\sigma$ depends only on the second variable, $\sigma(x,\xi) = \hat\mu(\xi)$, where $\hat\mu$ is the Fourier transform of a measure or distribution $\mu$ on $\mathbb{R}^d$, then $K_\sigma f = \mu * f$ is a convolution operator or a so-called Fourier multiplier.
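Both special cases can be reproduced in a finite model. The sketch below replaces $\mathbb{R}^d$ by the cyclic group $\mathbb{Z}_N$ and the Fourier transform by the discrete Fourier transform. This discretization and the function name `kohn_nirenberg` are our assumptions and not part of the text.

```python
import numpy as np

def kohn_nirenberg(sigma, f):
    """Discrete analogue of (5.39) on Z_N:
    (K f)[x] = (1/N) * sum_xi sigma[x, xi] * fhat[xi] * exp(2 pi i x xi / N)."""
    N = len(f)
    fhat = np.fft.fft(f)                               # fhat[xi] = sum_x f[x] exp(-2 pi i x xi / N)
    x = np.arange(N)
    phase = np.exp(2j * np.pi * np.outer(x, x) / N)    # phase[x, xi]
    return (sigma * phase) @ fhat / N

rng = np.random.default_rng(0)
N = 16
f = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# Symbol depending on x only: K_sigma is multiplication by m.
m = rng.standard_normal(N)
mult = kohn_nirenberg(np.tile(m[:, None], (1, N)), f)

# Symbol depending on xi only, sigma(x, xi) = muhat(xi): K_sigma is (circular) convolution with mu.
mu = rng.standard_normal(N)
conv = kohn_nirenberg(np.tile(np.fft.fft(mu)[None, :], (N, 1)), f)
conv_direct = np.array([sum(mu[y] * f[(x - y) % N] for y in range(N)) for x in range(N)])

print(np.allclose(mult, m * f), np.allclose(conv, conv_direct))
```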

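Writing

$$\begin{aligned}
K_\sigma f(x) &= \int_{\mathbb{R}^d} \sigma(x,\xi)\, \hat f(\xi)\, e^{2\pi i x \cdot \xi}\, d\xi \\
&= \int_{\mathbb{R}^d} \Big( \int_{\mathbb{R}^d} \sigma(x,\xi)\, e^{2\pi i \xi \cdot (x-y)}\, d\xi \Big) f(y)\, dy \\
&= \int_{\mathbb{R}^d} k(x, x-y) f(y)\, dy,
\end{aligned}$$

we may interpret $K_\sigma f$ as a “time-dependent” convolution with kernel $k(x,y)$. This is the reason why pseudodifferential operators are used to model time-varying continuous systems in signal processing. See Section 5.3.6 for a detailed discussion.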

By using the time-frequency shifts $M_\eta T_x f(t) = e^{2\pi i \eta \cdot t} f(t - x)$, we may write every pseudodifferential operator formally as a superposition of time-frequency shifts as follows:

$$K_\sigma f = \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} \hat\sigma(\eta, u)\, M_\eta T_{-u} f \, du\, d\eta. \tag{5.41}$$


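To understand (5.41), assume that $\sigma, \hat\sigma \in L^1(\mathbb{R}^{2d})$ and use the Fourier inversion formula $\int_{\mathbb{R}^d} \sigma(x,\xi)\, e^{-2\pi i u \cdot \xi}\, d\xi = \int_{\mathbb{R}^d} \hat\sigma(\eta, u)\, e^{2\pi i \eta \cdot x}\, d\eta$. Now the following computation is rigorous:

$$\begin{aligned}
K_\sigma f(x) &= \int_{\mathbb{R}^d} \sigma(x,\xi)\, \hat f(\xi)\, e^{2\pi i x \cdot \xi}\, d\xi \\
&= \iint_{\mathbb{R}^{2d}} \sigma(x,\xi)\, e^{-2\pi i (y-x) \cdot \xi} f(y)\, dy\, d\xi \\
&= \iint_{\mathbb{R}^{2d}} \hat\sigma(\eta, y-x)\, e^{2\pi i \eta \cdot x} f(y)\, dy\, d\eta \\
&= \iint_{\mathbb{R}^{2d}} \hat\sigma(\eta, u)\, e^{2\pi i \eta \cdot x} f(u+x)\, du\, d\eta.
\end{aligned} \tag{5.42}$$

For general symbols $\sigma \in \mathcal{S}'(\mathbb{R}^{2d})$, (5.42) can be proved with a distributional argument. The form (5.41) is sometimes called the spreading representation of $K_\sigma$. For a survey of the time-frequency approach to pseudodifferential operators, we refer to [111].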
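The identity between (5.39) and the spreading representation has an exact finite counterpart. The sketch below assumes a model on the cyclic group $\mathbb{Z}_N$ with the discrete Fourier transform, where the integrals become normalized sums; the function names are ours and the normalization constants belong to this model only.

```python
import numpy as np

def kohn_nirenberg(sigma, f):
    """(K f)[x] = (1/N) sum_xi sigma[x, xi] fhat[xi] exp(2 pi i x xi / N)."""
    N = len(f)
    x = np.arange(N)
    phase = np.exp(2j * np.pi * np.outer(x, x) / N)
    return (sigma * phase) @ np.fft.fft(f) / N

def spreading(sigma, f):
    """(1/N^2) sum_{eta,u} sigmahat[eta, u] * (M_eta T_{-u} f), the discrete form of (5.41)."""
    N = len(f)
    # sigmahat[eta, u] = sum_{x, xi} sigma[x, xi] exp(-2 pi i (eta x + u xi) / N)
    sigma_hat = np.fft.fft2(sigma)
    x = np.arange(N)
    out = np.zeros(N, dtype=complex)
    for eta in range(N):
        modulation = np.exp(2j * np.pi * eta * x / N)             # M_eta
        for u in range(N):
            out += sigma_hat[eta, u] * modulation * np.roll(f, -u)  # (T_{-u} f)[x] = f[x + u]
    return out / N**2

rng = np.random.default_rng(0)
N = 12
sigma = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
f = rng.standard_normal(N) + 1j * rng.standard_normal(N)
lhs, rhs = kohn_nirenberg(sigma, f), spreading(sigma, f)
print(np.allclose(lhs, rhs))
```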

The theory of pseudodifferential operators is usually treated as a subject of classical “hard” analysis, as is exemplified in the treatises of Hörmander [141] and Stein [217]. The spreading representation (5.41) suggests an alternative approach to pseudodifferential operators with time-frequency methods. The time-frequency approach has been particularly successful in the study of time-varying systems (Section 5.3.6) and is highly relevant for our discussion of Wiener’s Lemma.

5.3.5.2 The Sjöstrand Class

The mapping $\sigma \mapsto K_\sigma$ is an example of a symbolic calculus (as discussed at the end of Section 5.2). Our first task is the identification of “nice” symbols. Here classical analysis and time-frequency analysis offer rather different answers. Whereas the classical Hörmander classes are defined by differentiability properties, the time-frequency approach defines symbol classes via properties of the short-time Fourier transform.

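Fix a nonzero window function $\Phi \in \mathcal{S}(\mathbb{R}^{2d})$ of $2d$ variables, e.g., the Gaussian. The short-time Fourier transform (STFT) of a symbol $\sigma$ is

$$V_\Phi \sigma(z,\zeta) = (\sigma \cdot T_z \bar\Phi)^\wedge(\zeta) = \langle \sigma, M_\zeta T_z \Phi \rangle, \quad z, \zeta \in \mathbb{R}^{2d}. \tag{5.43}$$

The STFT of a symbol is a function on $\mathbb{R}^{4d}$. We say that a symbol $\sigma$ belongs to the Sjöstrand class $M^{\infty,1}(\mathbb{R}^{2d})$ if

$$\|\sigma\|_{M^{\infty,1}} = \int_{\mathbb{R}^{2d}} \sup_{z \in \mathbb{R}^{2d}} |(\sigma \cdot T_z \bar\Phi)^\wedge(\zeta)|\, d\zeta = \int_{\mathbb{R}^{2d}} \sup_{z \in \mathbb{R}^{2d}} |V_\Phi \sigma(z,\zeta)|\, d\zeta < \infty. \tag{5.44}$$

To obtain better control of the smoothness of the symbol, we will also consider the weighted Sjöstrand class $M_v^{\infty,1}(\mathbb{R}^{2d})$. We assume that $v$ is a submultiplicative function on $\mathbb{R}^{2d}$. Then the weighted Sjöstrand class $M_v^{\infty,1}(\mathbb{R}^{2d})$ is defined by the norm

$$\|\sigma\|_{M_v^{\infty,1}} = \int_{\mathbb{R}^{2d}} \sup_{z \in \mathbb{R}^{2d}} |V_\Phi \sigma(z,\zeta)|\, v(\zeta)\, d\zeta.$$

It can be shown that the definition of $M_v^{\infty,1}(\mathbb{R}^{2d})$ is independent of the particular window function $\Phi$ as long as $\Phi$ belongs to a suitable space of test functions. Then different windows yield equivalent norms on $M_v^{\infty,1}(\mathbb{R}^{2d})$ [108, Thm. 11.3.7].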

Note that if $\sigma \in M^{\infty,1}(\mathbb{R}^{2d})$ and $z \in \mathbb{R}^{2d}$ is fixed, then $(\sigma \cdot T_z \bar\Phi)^\wedge(\zeta) = V_\Phi \sigma(z,\zeta) \in L^1(\mathbb{R}^{2d})$. This means that $\sigma$ coincides locally with the Fourier transform of an $L^1$-function. At this stage one may already sense an analogy between Fourier series with $\ell^1$-coefficients and the symbol class $M^{\infty,1}$. This analogy should alert us to yet another version of Wiener’s Lemma.

Before approaching Wiener’s Lemma, we first have to find two Banach algebras [a small one corresponding to the absolutely convergent Fourier series and a big one corresponding to $C(\mathbb{T})$]. This preliminary work was easy for convolution operators (Lemma 5.17) and for matrices [Lemma 5.28 and (5.23)]. For pseudodifferential operators the Banach algebra property is nontrivial and interesting in its own right.

We first state the algebra property of the Sjöstrand class [110, 213, 224]. The composition of two pseudodifferential operators $K_\sigma$ and $K_\tau$ defines a product on the level of symbols via $K_\sigma K_\tau = K_{\sigma \circ \tau}$. Likewise, taking the adjoint operator yields an involution on the level of symbols by $(K_\sigma)^* = K_{\sigma^*}$. Explicit formulas are available [141], but are not necessary for the time-frequency approach.

Theorem 5.51. If $v$ is submultiplicative and $\sigma, \tau \in M_v^{\infty,1}$, then $K_\sigma K_\tau = K_{\sigma \circ \tau}$ with $\sigma \circ \tau \in M_v^{\infty,1}$. In fact, $M_v^{\infty,1}$ is a Banach *-algebra with respect to $\circ$.

The following boundedness result has been proved many times [29, 108, 112, 
116,213]. 

Theorem 5.52. If $\sigma \in M^{\infty,1}(\mathbb{R}^{2d})$, then $K_\sigma$ is bounded on $L^2(\mathbb{R}^d)$ and $\|K_\sigma\|_{L^2 \to L^2} \le C \|\sigma\|_{M^{\infty,1}}$.

When dealing with convolution operators, we identified the sequence $h$ with the corresponding convolution operator $C_h$ and thus obtained an embedding of $\ell^1(\mathbb{Z})$ into $\mathcal{B}(\ell^2)$. In this spirit let us define $\mathrm{Op}(M_v^{\infty,1})$ as the set of all operators $T$ from $\mathcal{S}(\mathbb{R}^d)$ to $\mathcal{S}'(\mathbb{R}^d)$ that can be written as a pseudodifferential operator $T = K_\sigma$ with a symbol $\sigma \in M_v^{\infty,1}$. Then the two previous results (Theorems 5.51 and 5.52) can be rephrased by saying that

$\mathrm{Op}(M_v^{\infty,1})$ is a Banach *-subalgebra of $\mathcal{B}(L^2(\mathbb{R}^d))$

with norm $\|K_\sigma\|_{\mathrm{Op}(M_v^{\infty,1})} = \|\sigma\|_{M_v^{\infty,1}}$.

5.3.5.3 Wiener’s Lemma for Pseudodifferential Operators

We have reached exactly the same point as in our discussions of convolution operators and matrix algebras. We study an interesting, noncommutative Banach algebra of operators with two striking properties: It is embedded into the $C^*$-algebra of bounded operators on $L^2$ and it possesses a hidden $\ell^1$-like structure. So the logical next question is whether this algebra $\mathrm{Op}(M^{\infty,1})$ is again inverse-closed in $\mathcal{B}(L^2(\mathbb{R}^d))$. This fundamental discovery is due to Sjöstrand [214].

Theorem 5.53. If $\sigma \in M^{\infty,1}(\mathbb{R}^{2d})$ and $K_\sigma$ is invertible on $L^2(\mathbb{R}^d)$, then $K_\sigma^{-1} = K_\tau$ for some $\tau \in M^{\infty,1}$.

Sjöstrand’s proof works with a decomposition of the pseudodifferential operator $K_\sigma$ into small localized pieces and with Wiener’s Lemma for the matrix algebra $\mathcal{C}$ of Gohberg, Baskakov, and Kurbatov (Section 5.3.2). Our recent proof [112] applies time-frequency methods and structural Banach algebra arguments. It extends and sharpens Sjöstrand’s result to the weighted case and clarifies the precise spectral invariance properties.

Theorem 5.54 ([112]). Assume that $v$ is a submultiplicative weight on $\mathbb{R}^{2d}$ satisfying the GRS condition $\lim_{n\to\infty} v(nz)^{1/n} = 1$ for all $z \in \mathbb{R}^{2d}$.

If $\sigma \in M_v^{\infty,1}(\mathbb{R}^{2d})$ and $K_\sigma$ is invertible on $L^2(\mathbb{R}^d)$, then $K_\sigma^{-1} = K_\tau$ for some $\tau \in M_v^{\infty,1}$.

Theorem 5.55 ([112, 114]). Assume that $v$ is submultiplicative on $\mathbb{R}^{2d}$. Then $\mathrm{Op}(M_v^{\infty,1})$ is inverse-closed in $\mathcal{B}(L^2)$ if and only if $v$ satisfies the GRS condition

$$\lim_{n\to\infty} v(nz)^{1/n} = 1 \quad \text{for all } z \in \mathbb{R}^{2d}.$$

Once again the counterexample follows the pattern established in Section 5.3.1. If for some $z_0 = (x_0, \xi_0) \in \mathbb{R}^{2d}$ we have $\lim_{n\to\infty} v(nz_0)^{1/n} = e^a > 1$, then we consider the operator $\mathrm{Id}_{L^2} - e^{-\delta}\pi(z_0)$ for $\delta < a$. Its symbol $\sigma(x,\xi) = 1 - e^{-\delta} e^{2\pi i(\xi_0 \cdot x - x_0 \cdot \xi)}$ is in $M_v^{\infty,1}(\mathbb{R}^{2d})$, but the symbol of the inverse operator $\sum_{n=0}^{\infty} e^{-\delta n}\pi(z_0)^n$ is not in $M_v^{\infty,1}(\mathbb{R}^{2d})$. See Exercises.
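The Kohn-Nirenberg symbol of a single time-frequency shift can be read off from (5.39). The following short computation is a sketch that assumes the convention $\pi(z_0) = M_{\xi_0} T_{x_0}$ for $z_0 = (x_0, \xi_0)$ and a parameter $\delta > 0$; both are our assumptions about notation.

```latex
\begin{align*}
\pi(z_0) f(x) &= e^{2\pi i \xi_0 \cdot x} f(x - x_0)
  = e^{2\pi i \xi_0 \cdot x} \int_{\mathbb{R}^d} \hat f(\xi)\, e^{2\pi i (x - x_0)\cdot \xi}\, d\xi \\
 &= \int_{\mathbb{R}^d} e^{2\pi i (\xi_0 \cdot x - x_0 \cdot \xi)}\, \hat f(\xi)\, e^{2\pi i x \cdot \xi}\, d\xi ,
\end{align*}
% so \pi(z_0) = K_\sigma with \sigma(x,\xi) = e^{2\pi i(\xi_0 \cdot x - x_0 \cdot \xi)}.
% Since \pi(z_0) is unitary, \| e^{-\delta} \pi(z_0) \|_{L^2 \to L^2} = e^{-\delta} < 1, and the Neumann series
\[
  \bigl( \mathrm{Id} - e^{-\delta} \pi(z_0) \bigr)^{-1} = \sum_{n=0}^{\infty} e^{-\delta n}\, \pi(z_0)^n
\]
% converges in the operator norm on L^2(\mathbb{R}^d).
```

Under these assumptions the operator is invertible on $L^2(\mathbb{R}^d)$ for every $\delta > 0$, so the failure described in the counterexample concerns only the symbol class of the inverse.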

Further generalizations were obtained in [120]. Let us also mention that the 
theory of the rotation algebra discussed in Section 5.3.3 and the time-frequency 
analysis of pseudodifferential operators are related: Theorem 5.44 can be derived 
from Theorem 5.54 [115]. 


5.3.5.4 Spectral Invariance 

As with convolution operators, pseudodifferential operators with “nice” symbols are bounded on a much larger class of function spaces, the so-called modulation spaces. In time-frequency analysis these are defined by properties of the short-time Fourier transform. Let $\varphi(t) = e^{-\pi t \cdot t}$ be the Gaussian on $\mathbb{R}^d$, $1 \le p, q < \infty$, and $m$ be a $v$-moderate weight. Then the modulation space $M^{p,q}_m$ is defined as the completion of the space of finite linear combinations

$$\Big\{ f : f = \sum_{j=1}^{n} c_j \pi(z_j)\varphi,\ c_j \in \mathbb{C},\ z_j \in \mathbb{R}^{2d} \Big\}$$

with respect to the norm

$$\|f\|_{M^{p,q}_m} = \Big( \int_{\mathbb{R}^d} \Big( \int_{\mathbb{R}^d} |V_\varphi f(x,\xi)|^p\, m(x,\xi)^p\, dx \Big)^{q/p} d\xi \Big)^{1/q}.$$

For $p = \infty$ or $q = \infty$ a small modification of the definition is necessary. For a detailed exposition of modulation spaces, see [108, Chap. 11-13]; for a historical account with an extensive list of references, see [84].

The following general boundedness result for pseudodifferential operators on 
modulation spaces is the analogue of Young’s inequality for convolution 
(Lemma 5.17). See [108, Chap. 14] or [112, 120]. 

Theorem 5.56. If $\sigma \in M_v^{\infty,1}(\mathbb{R}^{2d})$, then $K_\sigma$ is bounded on all modulation spaces $M^{p,q}_m$ with $1 \le p, q \le \infty$ and $v$-moderate $m$.

As a consequence of Theorem 5.53, we obtain the complete spectral invariance 
for pseudodifferential operators. 

Corollary 5.57. Assume that $v$ satisfies the GRS condition and $\sigma \in M_v^{\infty,1}$. Then the spectral invariance

$$\sigma_{\mathcal{B}(M^{p,q}_m)}(K_\sigma) = \sigma_{\mathcal{B}(L^2)}(K_\sigma)$$

holds for every $p, q \in [1, \infty]$ and every $v$-moderate weight $m$.

The argument is similar to the proof of Theorem 5.19 and does not require new ideas. One shows that $K_\sigma$ is invertible on $M^{p,q}_m$ if and only if $(K_\sigma)^*$ is invertible on $M^{p,q}_m$ and then uses duality and interpolation of modulation spaces.

To sum up, the spectrum of a “nice” pseudodifferential operator does not depend 
on the space on which it acts. 

A Connection to Classical Pseudodifferential Operators. Recall that a symbol $\sigma$ belongs to the Hörmander class $S^0_{0,0}$ if and only if $\partial^\alpha \sigma \in L^\infty(\mathbb{R}^{2d})$ for all multi-indices $\alpha \ge 0$. Toft [225] observed that the Hörmander class can be written as an intersection of modulation spaces. If $v_s(\zeta) = (1 + |\zeta|)^s$, then

$$S^0_{0,0} = \bigcap_{s \ge 0} M^{\infty,1}_{v_s}(\mathbb{R}^{2d}).$$

Since $\mathrm{Op}(M^{\infty,1}_{v_s})$ is inverse-closed in $\mathcal{B}(L^2)$, we find that the intersection $\bigcap_{s \ge 0} \mathrm{Op}(M^{\infty,1}_{v_s})$ is also inverse-closed in $\mathcal{B}(L^2)$. Formulated explicitly, this is a famous result of Beals [17] and probably the earliest result on spectral invariance in the theory of pseudodifferential operators.

Corollary 5.58. If $\sigma \in S^0_{0,0}$ and $K_\sigma$ is invertible on $L^2(\mathbb{R}^d)$, then $K_\sigma^{-1} = K_\tau$ for some $\tau \in S^0_{0,0}$.

5.3.5.5 Almost Diagonalization of Pseudodifferential Operators

The results about pseudodifferential operators may look fairly technical. In this section we would like to make them a bit more plausible by using time-frequency analysis. Since a pseudodifferential operator $K_\sigma$ is defined as a superposition of time-frequency shifts, it is almost obligatory to ask how $K_\sigma$ acts on a time-frequency shift $\pi(z)$. This idea culminates in the insight that pseudodifferential operators are almost diagonal with respect to time-frequency “bases.”

For the formulation of this result we need an additional concept. Fix a lattice $\Lambda = A\mathbb{Z}^{2d} \subseteq \mathbb{R}^{2d}$ and a “nice” basis function $g$. Ideally we choose the Gaussian $e^{-\pi t \cdot t}$, but any nonzero function satisfying $\int_{\mathbb{R}^{2d}} |\langle g, \pi(z)g \rangle|\, v(z)\, dz < \infty$ works. Such a $g$ belongs to the modulation space $M_v^1(\mathbb{R}^d)$. We say that the set $\{\pi(\lambda)g : \lambda \in \Lambda\}$ is a Gabor frame if there exist constants $A, B > 0$ such that

$$A \|f\|_2^2 \le \sum_{\lambda \in \Lambda} |\langle f, \pi(\lambda) g \rangle|^2 \le B \|f\|_2^2 \quad \text{for all } f \in L^2(\mathbb{R}^d).$$

General frame theory and the explicit construction of Gabor frames are discussed in detail in Ole Christensen’s Chapter 1. The construction of Gabor frames with a basis function in $M_v^1(\mathbb{R}^d)$ for rapidly increasing weights is more difficult and in fact requires Wiener’s Lemma for twisted convolution! See [85, 118] and [108, Chap. 13].

The following characterization of the Sjöstrand class is the key to understanding many properties of pseudodifferential operators and to proving the main results in Section 5.3.5.

Theorem 5.59. Assume that $g \in M_v^1(\mathbb{R}^d)$, $g \ne 0$, and that $\{\pi(\lambda)g : \lambda \in \Lambda\}$ is a Gabor frame for $L^2(\mathbb{R}^d)$. Then the following properties are equivalent.

1. $\sigma \in M_v^{\infty,1}$.

2. There exists a continuous function $H \in L^1_v(\mathbb{R}^{2d})$ such that

$$|\langle K_\sigma(\pi(z)g), \pi(w)g \rangle| \le H(w - z) \quad \text{for all } w, z \in \mathbb{R}^{2d}. \tag{5.45}$$

3. There is an $h \in \ell^1_v(\Lambda)$ such that

$$|\langle K_\sigma(\pi(\mu)g), \pi(\lambda)g \rangle| \le h(\lambda - \mu) \quad \text{for all } \lambda, \mu \in \Lambda. \tag{5.46}$$

Theorem 5.59 shows that the time-frequency shift $\pi(z)$ is almost an eigenvector of $K_\sigma$ and that $K_\sigma$ is almost diagonalized by frames of time-frequency shifts.

Let us rewrite (5.46) and connect it to the topic of matrix algebras. Let $M(\sigma)$ be the matrix indexed by $\mathbb{Z}^{2d}$ with the entries

$$M(\sigma)_{kl} = \langle K_\sigma(\pi(Al)g), \pi(Ak)g \rangle, \quad \lambda = Ak,\ \mu = Al \in \Lambda. \tag{5.47}$$

Then (5.46) states that $|M(\sigma)_{kl}| \le h(A(k - l))$, and thus the matrix of $K_\sigma$ is dominated by convolution with the sequence $h \circ A$. In the light of Section 5.3.2 we may recast Theorem 5.59 in a more compact way.

Theorem 5.60. Assume that $\{\pi(\lambda)g : \lambda \in \Lambda\}$ is a frame for $L^2(\mathbb{R}^d)$ and $g \in M_v^1(\mathbb{R}^d)$. Then a symbol $\sigma$ belongs to the weighted Sjöstrand class $M_v^{\infty,1}$ if and only if $M(\sigma)$ belongs to the matrix algebra $\mathcal{C}_v$:

$$\sigma \in M_v^{\infty,1}(\mathbb{R}^{2d}) \iff M(\sigma) \in \mathcal{C}_v.$$

Furthermore, it can be shown that $\|\sigma\|_{M_v^{\infty,1}}$ and $\|M(\sigma)\|_{\mathcal{C}_v}$ are equivalent norms on the symbol class $M_v^{\infty,1}$.

By means of Theorem 5.59, the algebra property of $M_v^{\infty,1}$ and Wiener’s Lemma for pseudodifferential operators are now much easier to understand and follow from the corresponding properties of the matrix algebra $\mathcal{C}_v$.

Assume that $\sigma, \tau \in M_v^{\infty,1}$ and thus $M(\sigma), M(\tau) \in \mathcal{C}_v$. The composition of operators corresponds to matrix multiplication. Consequently, $K_{\sigma \circ \tau} = K_\sigma K_\tau$ corresponds to the matrix $M(\sigma)M(\tau) \in \mathcal{C}_v$ and thus $\sigma \circ \tau \in M_v^{\infty,1}$.

Likewise, the inverse of $K_\sigma$ corresponds to the inverse of $M(\sigma)$. Since $\mathcal{C}_v$ is inverse-closed in $\mathcal{B}(\ell^2)$, $M(\sigma)^{-1} \in \mathcal{C}_v$ and by Theorem 5.59 the symbol $\tau$ of $K_\sigma^{-1} = K_\tau$ is in $M_v^{\infty,1}$. The rigorous proof requires much more work, because strictly speaking, $M(\sigma)$ is not invertible. It possesses a nontrivial kernel and is invertible only on a certain subspace of $\ell^2$. See [112] for the precise details.

Our main point here is that Wiener’s Lemma for matrix algebras enters crucially 
and directly in the proof of Wiener’s Lemma for pseudodifferential operators. 


5.3.6 Time-Varying Systems and Wireless Communications 

In Section 5.2.6 we used discrete time-invariant systems as a motivation for convo¬ 
lution operators on Z, and in Section 5.3.2 discrete time-varying systems served as 
a motivation to study matrix algebras. In both cases, we assumed a “digital” world, 
and a signal was understood to be a sequence of numbers. The “physical” world, 
however, is continuous, and therefore we now turn to the discussion of “analog” 
signals and continuous time-varying systems. For the mathematician, signals are
functions on $\mathbb{R}$ or $\mathbb{R}^d$, and systems are operators on $L^2(\mathbb{R}^d)$.

The goal of this section is to discuss how the results about pseudodifferential 
operators of Section 5.3.5 can be applied in an engineering context. 


5.3.6.1 Time-Varying Systems 

Let us make a simple model of a time-varying system as it is used in mobile 
communications. 

In Fig. 5.1 a signal $f$ is transmitted by an antenna to a cellular phone in a moving tramway car. The signal is an electromagnetic wave and the propagation of $f$ is governed by the wave equation. Thus, we are forced to work with analog signals.

During transmission the signal is distorted and transformed by various effects, of which we model two main effects:

(a) The signal is reflected at various obstacles and the signal arrives at the receiver with a delay caused by different path lengths. Formally, the received signal $\tilde f$ is a weighted superposition of time shifts of the transmitted signal $f$ with some weight $V$:

$$\tilde f(t) = \int_{\mathbb{R}^d} V(u)\, f(t + u)\, du.$$

The weight $V$ depends on the physical characteristics of the transmission, such as the path length or the absorption at reflectors.

(b) If the sender and receiver are in motion with respect to each other, then the Doppler effect will result in a frequency shift proportional to the relative velocity of sender and receiver. Since $(M_\eta f)^\wedge(\tau) = \hat f(\tau - \eta)$, the received signal $\tilde f$ is a superposition of modulations (= frequency shifts) with some weight $W$:

$$\tilde f(t) = \int_{\mathbb{R}^d} W(\eta)\, e^{2\pi i \eta \cdot t} f(t)\, d\eta.$$

The differing path lengths and the Doppler effect are illustrated in Fig. 5.1.



Fig. 5.1: Signal distortion caused by variable path length and by the Doppler effect. (Courtesy of 
Gerald Matz, Technical University of Vienna.) 


Combining the two types of distortion, we find that the received signal $\tilde f$ is a superposition of time-frequency shifts, which we write as

$$\tilde f(t) = \iint_{\mathbb{R}^{2d}} \hat\sigma(\eta, u)\, e^{2\pi i \eta \cdot t} f(t + u)\, du\, d\eta. \tag{5.48}$$

The weighting factor $\hat\sigma$ is the spreading function of the time-varying system. It depends on the physical characteristics of the transmission and is usually estimated with statistical methods.

Comparing with (5.42), we understand the awkward notation for the weighting factor. The distortion $f \to \tilde f$ is nothing more than the pseudodifferential operator $K_\sigma$ with the Kohn-Nirenberg symbol $\sigma$, and thus

$$\tilde f(x) = K_\sigma f(x) = \int_{\mathbb{R}^d} \sigma(x,\xi)\, \hat f(\xi)\, e^{2\pi i x \cdot \xi}\, d\xi.$$

Although mathematicians and engineers study the same object, there is a big 
difference between pseudodifferential operators in analysis and time-varying 
systems in engineering. Pseudodifferential operators are used for the construction 
of approximate inverses (parametrices) of partial differential operators and for the 
study of regularity properties in partial differential equations. Thus, all conditions 
and arguments in analysis involve derivatives and smoothness. 

In engineering, however, differentiability properties do not play a role. The 
spreading representation (5.48) rather suggests time-frequency methods as a tool 
for the investigation of time-varying systems. 

Symbol Classes in Mobile Communications. Let us first discuss the question of symbol classes. For physical reasons there are a maximum Doppler shift $\nu_0$ and also a maximum time delay $\tau_0$; consequently, the spreading function $\hat\sigma$ is compactly supported in the rectangle $[-\nu_0, \nu_0] \times [0, \tau_0]$. A time-varying system with compactly supported spreading function is called underspread [183]. Such operators play an important role in communication theory.

Concerning the nature of $\sigma$, the modeling of engineers is at odds with the mathematician’s need for rigor. The standard assumption of engineers is that $\sigma \in L^2$. This assumption is doubtful, because then by Pool’s theorem [197] $K_\sigma$ must be a Hilbert-Schmidt operator. This is definitely a problem, because the class of Hilbert-Schmidt operators excludes the distortion-free channel (the identity operator) and time-invariant channels (convolution operators). Furthermore, a Hilbert-Schmidt operator cannot have a bounded inverse on $L^2(\mathbb{R})$, or in engineering terms, the recovery of $f$ from $\tilde f$ is ill-posed, and the equalization will be extremely unstable.

To make a more satisfactory model for the symbols arising in wireless communications and time-varying channels, we follow Strohmer [219]. We keep the assumption that $\operatorname{supp} \hat\sigma$ is compact, but we admit $\hat\sigma$ to be a distribution in $M^\infty(\mathbb{R}^{2d})$. This means that $\hat\sigma$ is a tempered distribution with bounded short-time Fourier transform. For instance, if $\hat\sigma = \delta$ (the point measure at 0), then $K_\sigma = \mathrm{Id}_{L^2}$.

Lemma 5.61. Assume that $\hat\sigma \in M^\infty(\mathbb{R}^{2d})$ and that $\operatorname{supp} \hat\sigma$ is compact in some ball $B(0,R) = \{z \in \mathbb{R}^{2d} : |z| \le R\}$. If $\Phi \in \mathcal{S}(\mathbb{R}^{2d})$ and $\operatorname{supp} \hat\Phi \subseteq B(0,R)$, then

$$\int_{\mathbb{R}^{2d}} \sup_{z \in \mathbb{R}^{2d}} |V_\Phi \sigma(z,\zeta)|\, v(\zeta)\, d\zeta < \infty \tag{5.49}$$

for any nonnegative locally bounded function $v$. Consequently, $\sigma \in M_v^{\infty,1}(\mathbb{R}^{2d})$ for any weight $v$.

Proof. Observe that $(\sigma \cdot T_z \bar\Phi)^\wedge(\zeta) = (\hat\sigma * M_{-z} \hat{\bar\Phi})(\zeta)$ and hence for fixed $z$ has its support in $B(0, 2R)$. We find that

$$V_\Phi \sigma(z,\zeta) = 0 \quad \text{for } |\zeta| > 2R.$$

Since $V_\Phi \sigma$ is bounded, (5.49) follows. □


5.3.6.2 Transmission of Information by OFDM 

Next we explain how pseudodifferential operators enter in the process of data 
transmission. 

1. The data. Given is a discrete set of data (digital information), $c_{kl} \in \mathbb{C}$, where $k, l \in \mathbb{Z}$. Usually the numbers $c_{kl}$ are taken from a finite alphabet, either $c_{kl} \in \{-1, 1\}$ or $c_{kl} \in \{i^t(1 + i) : t = 0, 1, 2, 3\}$. The parameter $k$ indicates the time when the coefficient $c_{kl}$ is transmitted, and the parameter $l$ labels the frequency band over which the coefficient is sent. For fixed $l$, we may think of the sequence $\{c_{kl} : k \in \mathbb{Z}\}$ as a “word” that is sent over the $l$th frequency band. For fixed $k$, the set $\{c_{kl} : l \in \mathbb{Z}\}$ is the symbol group that is transmitted at time $k$.

2. Digital-analog conversion. In the first step the digital information $c_{kl}$ is converted to an analog signal. The data $c_{kl}$ serve as coefficients in a series expansion. Fix a suitable pulse $g$; then the transmitted analog signal is

$$f(t) = \sum_{k \in \mathbb{Z}} \Big( \sum_{l \in \mathbb{Z}} c_{kl}\, e^{2\pi i \beta l t} \Big) g(t - \alpha k) = \sum_{k,l \in \mathbb{Z}} c_{kl}\, M_{\beta l} T_{\alpha k} g(t). \tag{5.50}$$

A series of this form is called a Gabor series. (Of course, in practice the sum is finite.)

It is easy to understand why Gabor series are a convenient way to transform a discrete set of data into an analog signal. The symbol group transmitted at time $\alpha k$ is $\sum_{l \in \mathbb{Z}} c_{kl}\, e^{2\pi i \beta l t} g(t - \alpha k)$; this is the Fourier series of the coefficients $c_{kl}$ and can be calculated easily with a fast Fourier transform (FFT). For fixed $l$, the $l$th word is transmitted as $f_l(t) = M_{\beta l}\big( \sum_{k \in \mathbb{Z}} c_{kl}\, g(t - \alpha k) \big)$. If $\operatorname{supp} \hat g \subseteq [-\beta'/2, \beta'/2]$, then $\operatorname{supp} \hat f_l \subseteq [\beta l - \beta'/2, \beta l + \beta'/2]$. If we choose $\beta' < \beta$, then each word is transmitted on a different frequency band.

This method for the simultaneous transmission of several independent data sets is called frequency-division multiplexing. If the time-frequency shifts $M_{\beta l} T_{\alpha k} g$ form an orthogonal set, then one speaks of orthogonal frequency-division multiplexing, OFDM in short.
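The conversion steps can be tried out in a small discrete model. The sketch below assumes a sampled setting in which the pulse is a normalized rectangle of $N$ samples, the time step is $\alpha = N$ samples, and the frequency step is $\beta = 1/N$, so that the time-frequency shifts are orthonormal. These choices and the function names are ours and only one possible instance of (5.50).

```python
import numpy as np

def ofdm_modulate(c):
    """c has shape (K, N): K symbol groups, N frequency bands. Returns K*N samples.
    The inner sum over l in (5.50) is an inverse FFT of each symbol group."""
    K, N = c.shape
    return (np.fft.ifft(c, axis=1) * np.sqrt(N)).reshape(K * N)

def ofdm_correlate(f, N):
    """Inner products of f with all time-frequency shifts of the rectangular pulse."""
    blocks = f.reshape(-1, N)
    return np.fft.fft(blocks, axis=1) / np.sqrt(N)

rng = np.random.default_rng(1)
K, N = 4, 8
c = (1 + 1j) * 1j ** rng.integers(0, 4, size=(K, N))   # alphabet {i^t (1 + i) : t = 0, 1, 2, 3}
f = ofdm_modulate(c)
y = ofdm_correlate(f, N)
print(np.allclose(y, c))
```

Without a channel between sender and receiver, the correlations reproduce the data because the atoms are orthonormal in this model.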

In industrial applications the pulse $g$ is usually chosen to be a characteristic function $g = \chi_{[0,\alpha']}$ for $\alpha' < \alpha$. Such a pulse achieves a good separation of consecutive symbol groups and works optimally in stationary environments. However, in nonstationary environments the Fourier transform of the characteristic function $\chi_{[0,\alpha']}$ decays slowly, and adjacent frequency bands are not separated adequately.

For nonstationary time-varying environments better pulse shapes are required. The ideal pulse is compactly supported in time and frequency, and its time-frequency shifts $M_{\beta l} T_{\alpha k} g$ are mutually orthogonal. These properties exclude each other because of the uncertainty principle. Thus, it becomes a relevant mathematical problem to construct appropriate pulse shapes that are compatible with the uncertainty principle. See [26, 184, 218, 220] for a small sample of papers by both engineers and mathematicians. Let us mention that Wiener’s Lemma both for Fourier series and for matrix algebras is used implicitly in several pulse-shaping constructions.

Figure 5.2 shows a formal representation of the transmitted signal in the time-frequency plane. Each coefficient $c_{kl}$ belongs to a different cell in the time-frequency plane; the goal of pulse design is to ensure that the cells are well separated from each other.



Fig. 5.2: Each coefficient $c_{kl}$ of the transmitted signal occupies a cell in the time-frequency plane.


3. Transmission of $f$ and signal distortion. Next the analog signal generated from the digital information is transmitted by a sender and, in the course of its propagation, undergoes various distortions. Let us emphasize that the transmission is a physical process subject to the wave equation. This is not a phenomenon in a discrete space, and any discretization is an approximation that has to be justified.

As explained above, the distortion is described by a pseudodifferential operator. The received signal is of the form

$$\tilde f = K_\sigma f = \iint_{\mathbb{R}^{2d}} \hat\sigma(\eta, u)\, M_\eta T_{-u} f\, du\, d\eta,$$

and the symbol may be assumed to be in a weighted Sjöstrand class $M_v^{\infty,1}(\mathbb{R}^{2d})$.

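4. Analog-digital conversion. In terms of the coefficients $c_{kl}$ the received signal is

$$\tilde f = K_\sigma f = \sum_{k',l' \in \mathbb{Z}} c_{k'l'}\, K_\sigma(M_{\beta l'} T_{\alpha k'} g).$$

The goal is now to recover the coefficients $c_{kl}$ from $\tilde f$. This task is usually approached by taking correlations, i.e., by taking inner products with time-frequency shifts of $g$ (or with other known pulse shapes). After taking correlations, we obtain the sequence

$$y_{kl} = \langle \tilde f, M_{\beta l} T_{\alpha k} g \rangle = \sum_{k',l' \in \mathbb{Z}} c_{k'l'}\, \langle K_\sigma(M_{\beta l'} T_{\alpha k'} g), M_{\beta l} T_{\alpha k} g \rangle.$$

Let $y$ be the vector with components $y_{kl}$ and $M(\sigma)$ be the matrix with entries

$$M(\sigma)_{kl,k'l'} = \langle K_\sigma(M_{\beta l'} T_{\alpha k'} g), M_{\beta l} T_{\alpha k} g \rangle. \tag{5.51}$$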

Then we may write the relation between the original data sequence $c$ and the output sequence $y$ as an infinite system of linear equations

$$M(\sigma)\, c = y. \tag{5.52}$$

The matrix $M(\sigma)$ is the matrix of $K_\sigma$ with respect to a set of time-frequency shifts. In the context of data transmission it is called the channel matrix. Its entries describe the interference between different cells in the time-frequency plane caused by the distortion operator $K_\sigma$.

5. Equalization. Finally, we have to recover the original data $c$. This amounts to solving the system (5.52) or explicitly

$$c = M(\sigma)^{-1} y.$$

In order to write the inverse of $M(\sigma)$ in a meaningful way, we have to assume that $M(\sigma)$ is an invertible matrix. The usual assumptions are that $K_\sigma$ is invertible and that the set of time-frequency shifts $M_{\beta l} T_{\alpha k} g$ forms a Riesz basis (or orthogonal basis) for some subspace of $L^2$. (Otherwise the input-output relationship $f \to \tilde f = K_\sigma f$ would be ill-posed and the reconstruction of the data $c_{kl}$ from the received signal $\tilde f$ would be unstable.)

In general, the solution of the system (5.52) or the inversion of the channel
matrix poses a challenging computational problem. In practice this is not a problem,
because it is generally assumed that the channel matrix $M(\sigma)$ is diagonal. Thus,
$\langle K_\sigma(M_{\beta l'} T_{\alpha k'} g), M_{\beta l} T_{\alpha k} g \rangle = 0$ for $(k,l) \neq (k',l')$. Consequently, the original data
are simply given by

$$c_{kl} = \langle K_\sigma(M_{\beta l} T_{\alpha k} g), M_{\beta l} T_{\alpha k} g \rangle^{-1}\, y_{kl}.$$
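The difference between exact and diagonal equalization can be tried out numerically. The sketch below is only an illustration in a hypothetical finite model (vectors in $\mathbb{C}^N$, a periodized Gaussian pulse, and a toy channel given by a few made-up time-frequency shifts); none of these choices is taken from the text. It compares $c = M(\sigma)^{-1}y$ with the entrywise quotient $y_{kl}/M(\sigma)_{kl,kl}$.

```python
import numpy as np

# Hypothetical finite model on C^N; all parameters are illustrative assumptions.
N, a, b = 144, 24, 24
n = np.arange(N)

def tf_shift(tau, nu):
    """Matrix of f -> M_nu T_tau f on C^N (cyclic shift, then modulation)."""
    return np.diag(np.exp(2j * np.pi * nu * n / N)) @ np.roll(np.eye(N), tau, axis=0)

dist = np.minimum(n, N - n)
g = np.exp(-np.pi * dist**2 / N)
g /= np.linalg.norm(g)

# Pulses M_{beta l} T_{alpha k} g as columns, and a toy channel with made-up taps.
G = np.column_stack([tf_shift(a * k, b * l) @ g
                     for k in range(N // a) for l in range(N // b)])
taps = {(0, 0): 1.0, (1, 0): 0.3, (0, 1): 0.2j, (2, 1): -0.1}
K = sum(w * tf_shift(tau, nu) for (tau, nu), w in taps.items())
M = G.conj().T @ K @ G                      # channel matrix

rng = np.random.default_rng(1)
c = rng.choice([1, -1, 1j, -1j], size=M.shape[0])
y = M @ c                                   # output sequence

c_full = np.linalg.solve(M, y)              # exact equalization: c = M^{-1} y
c_diag = y / np.diag(M)                     # diagonal equalization

def rel_err(x):
    """Relative l2 error against the transmitted data c."""
    return np.linalg.norm(x - c) / np.linalg.norm(c)

# Largest absolute row sum of the off-diagonal part of M.
off_mass = np.abs(M - np.diag(np.diag(M))).sum(axis=1).max()

print("relative error, full solve     :", rel_err(c_full))
print("relative error, diagonal only  :", rel_err(c_diag))
print("largest off-diagonal row sum   :", off_mass)
```

In this toy model the channel matrix is not exactly diagonal, so the diagonal equalizer is not exact; its error is controlled by the size of the off-diagonal part relative to the diagonal entries.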

This assumption of diagonality seems truly miraculous. There is no reason why 
the channel matrix should be exactly diagonal. At first glance it is really amazing 
that the OFDM method should work for time-varying systems. 

Nevertheless, the engineering intuition is correct, and the described equalization
method works in practice. The mathematical reason is, once again, Wiener’s
Lemma.

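Proposition 5.62. Assume that $\sigma$ has compact support and $M(\sigma)$ is invertible on
$L^2(\mathbb{R}^d)$, that $v(x,\xi) = e^{a(|x|+|\xi|)^b}$ for $a > 0$, $0 < b < 1$, and that $g \in M_v^1$. Then

$$|M(\sigma)_{kl,k'l'}| \le C\, e^{-a(|k-k'|+|l-l'|)^b}, \qquad k, k', l, l' \in \mathbb{Z},$$

and likewise

$$|(M(\sigma)^{-1})_{kl,k'l'}| \le C'\, e^{-a(|k-k'|+|l-l'|)^b}, \qquad \forall\, k, k', l, l' \in \mathbb{Z}.$$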

Proof. By Lemma 5.61 the symbol describing the channel is in every $M_v^{\infty,1}(\mathbb{R}^{2d})$.
Consequently, by Theorem 5.59 the matrix of $K_\sigma$ is almost diagonal and its entries
are dominated by $|M(\sigma)_{kl,k'l'}| \le h(k-k', l-l')$ for some sequence $h \in \ell_v^1(\mathbb{Z}^2)$. In
particular, $|M(\sigma)_{kl,k'l'}| \le h(k-k', l-l') \le C\, v(k-k', l-l')^{-1}$. Since the subexponential
weight $v$ satisfies the GRS condition, Wiener’s Lemma for pseudodifferential
operators (Theorem 5.53) implies that the same decay property holds for the inverse
matrix. □
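For the weight of Proposition 5.62 the GRS condition can be checked by hand. A short verification, assuming the condition is taken in its usual form $\lim_{n\to\infty} v(nz)^{1/n} = 1$ for every $z = (x,\xi)$:

```latex
\[
v(nz)^{1/n}
  = \exp\!\Big(\tfrac{1}{n}\, a\,\big(|nx| + |n\xi|\big)^{b}\Big)
  = \exp\!\Big(a\, n^{\,b-1}\,\big(|x| + |\xi|\big)^{b}\Big)
  \;\longrightarrow\; 1
  \qquad (n \to \infty),
\]
% because b - 1 < 0 forces n^{b-1} -> 0.
% For b = 1 the same computation gives the constant value
% exp(a(|x| + |\xi|)) > 1 for z \neq 0, so the limit is not 1.
```

This is where the restriction $0 < b < 1$ enters: the computation breaks down for the purely exponential weight $b = 1$.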

Proposition 5.62 says that the channel matrix and its inverse are almost diagonal 
with respect to arbitrary subexponential weights. Consequently, it suffices to use the 
diagonal (and possibly a few side-diagonals) for matrix computations. In particular, 
the inverse of $M(\sigma)$ is almost a diagonal matrix. Thus, Proposition 5.62 can be taken
as a mathematical justification of the engineering practice. 

At the core of this justification is Wiener’s Lemma for pseudodifferential 
operators. 
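The claim that the inverse inherits the off-diagonal decay can be observed numerically. The following sketch is an illustration under simplifying assumptions that are not in the text: a finite matrix with a single index in place of the double index $(k,l)$, a synthetic matrix equal to the identity plus a small perturbation with entries proportional to $1/v$, and the parameter values $a = 1$, $b = 1/2$.

```python
import numpy as np

a, b = 1.0, 0.5                 # subexponential weight v(m) = exp(a * m**b)
n = 200
idx = np.arange(n)
dist = np.abs(idx[:, None] - idx[None, :])
v = np.exp(a * dist**b)

# Synthetic stand-in for a channel matrix: identity plus a small
# off-diagonal part whose entries decay like 1/v (illustrative choice).
eps = 0.05
A = np.eye(n) + eps * (dist > 0) / v
A_inv = np.linalg.inv(A)

def offdiag_sup(B):
    """For each m, the largest |B[i, j]| over all entries with |i - j| = m."""
    return np.array([np.abs(B[dist == m]).max() for m in range(n)])

# If the inverse decays like 1/v, this weighted quantity stays bounded in m.
weighted = offdiag_sup(A_inv) * np.exp(a * np.arange(n)**b)
print("sup over m >= 1 of v(m) * |A_inv| on the m-th diagonal:", weighted[1:].max())

# Keep only the main diagonal and w side-diagonals of the inverse.
errs = []
for w in (0, 2, 5, 10):
    banded = np.where(dist <= w, A_inv, 0.0)
    errs.append(np.linalg.norm(A_inv - banded, 2))
    print(f"bandwidth {w:2d}: operator-norm error {errs[-1]:.2e}")
```

The second loop mirrors the remark that the diagonal and a few side-diagonals suffice: truncating the inverse to a narrow band changes it only slightly in operator norm.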
Exercises for Section 5.3

1. Show that the weight function $v(k) = e^{a|k|^b}$, $a, b > 0$, satisfies the GRS condition
if and only if $b < 1$. Show that the function v(k) = is submultiplicative
and satisfies the GRS condition.

2. Formulate and prove a version of the theorem of Wiener-Lévy (Theorem 5.16)
for the algebra of weighted absolutely convergent Fourier series $\mathcal{A}_v(\mathbb{T})$.

3. (a) Show that the space of matrices defined by the Schur-type conditions (5.27) 
is a Banach algebra with respect to matrix multiplication. 

(b) Prove the continuous embeddings of (5.28) and give
examples to show that the inclusions are proper.

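4. Let $\mathcal{J} = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}$ (consisting of $d \times d$ blocks). Given a lattice $\Lambda = A\mathbb{Z}^{2d}$ for
some invertible matrix $A$, define the adjoint lattice $\Lambda^\circ$ by

$$\Lambda^\circ = \mathcal{J} (A^T)^{-1} \mathbb{Z}^{2d}, \tag{5.53}$$

where $A^T$ is the transpose of $A$. Show that a time-frequency shift $\pi(z)$ commutes
with all $\pi(\lambda)$, $\lambda \in \Lambda$, if and only if $z \in \Lambda^\circ$ [88].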

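5. Given are a lattice $\Lambda \subset \mathbb{R}^{2d}$ and the exponential weight $v(\lambda) = e^{a|\lambda|}$. Find a
sequence $h \in \ell_v^1(\Lambda)$ such that the twisted convolution operator $C_h$ is invertible
on $\ell^2(\Lambda)$ with inverse $C_g$, but $g \notin \ell_v^1(\Lambda)$.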

6. Show that the set $\{C_h : h \in \ell_v^1(\Lambda)\}$ is a closed $*$-subalgebra of $\mathcal{C}_v$.

7. Prove statements (1) and (2) of Lemma 5.32. 

8. For a locally compact group $\mathcal{G}$, show that $L^1(\mathcal{G})$ is commutative if and only if
$\mathcal{G}$ is Abelian.

9. Let $m_n(f) = \mathrm{vol}(B_n)^{-1} \int_{B_n} f(x)\, dx$ for $f \in L^\infty(\mathbb{R}^d)$. Show that any limit point $m$
of the sequence $m_n$ is an invariant mean on $L^\infty(\mathbb{R}^d)$, i.e., $m(1) = 1$,
$|m(f)| \le \|f\|_\infty$, $f \ge 0 \Rightarrow m(f) \ge 0$, and $m(T_x f) = m(f)$ for all $f \in L^\infty(\mathbb{R}^d)$
and $x \in \mathbb{R}^d$.

10. Let $\mathcal{G}$ be a locally compact group and $v$ be a submultiplicative, even weight
on $\mathcal{G}$. Show that $L_v^1(\mathcal{G})$ is a Banach $*$-algebra with respect to convolution and
the involution $f^*(x) = \overline{f(x^{-1})}\, \Delta(x^{-1})$.

11. Let 

a. Describe the associated pseudodifferential operator . 

b. Show that a G M™ lX ( R 2d ) if and only if a G ( I ? d ). 